Copying Changes Between Branches

Now you and Sally are working on parallel branches of the project: you're working on a private branch, and Sally is working on the trunk, or main line of development.

For projects that have a large number of contributors, it's common for most people to have working copies of the trunk. Whenever someone needs to make a long-running change that is likely to disrupt the trunk, a standard procedure is to create a private branch and commit changes there until all the work is complete.

So, the good news is that you and Sally aren't interfering with each other. The bad news is that it's very easy to drift too far apart. Remember that one of the problems with the “crawl in a hole” strategy is that by the time you're finished with your branch, it may be near-impossible to merge your changes back into the trunk without a huge number of conflicts.

Instead, you and Sally might continue to share changes as you work. It's up to you to decide which changes are worth sharing; SVK gives you the ability to selectively “copy” changes between branches. And when you're completely finished with your branch, your entire set of branch changes can be copied back into the trunk.

Copying Specific Changes

In the previous section, we mentioned that both you and Sally made changes to integer.c on different branches. If you look at Sally's log message for revision 344, you can see that she fixed some spelling errors. No doubt, your copy of the same file still has the same spelling errors. It's likely that your future changes to this file will be affecting the same areas that have the spelling errors, so you're in for some potential conflicts when you merge your branch someday. It's better, then, to receive Sally's change now, before you start working too heavily in the same places.

It's time to use the svk merge command. This command, it turns out, is a very close cousin to the svk diff command (which you read about in Chapter 3). Both commands are able to compare any two objects in the repository and describe the differences. For example, you can ask svk diff to show you the exact change made by Sally in revision 82:

$ svk diff -r 81:82 //calc/trunk

=== integer.c
===================================================================
--- integer.c	(revision 81)
    integer.c	(revision 82)
@@ -147,7  147,7 @@
     case 6:  sprintf(info->operating_system, "HPFS (OS/2 or NT)"); break;
     case 7:  sprintf(info->operating_system, "Macintosh"); break;
     case 8:  sprintf(info->operating_system, "Z-System"); break;
-    case 9:  sprintf(info->operating_system, "CPM"); break;
     case 9:  sprintf(info->operating_system, "CP/M"); break;
     case 10:  sprintf(info->operating_system, "TOPS-20"); break;
     case 11:  sprintf(info->operating_system, "NTFS (Windows NT)"); break;
     case 12:  sprintf(info->operating_system, "QDOS"); break;
@@ -164,7  164,7 @@
     low = (unsigned short) read_byte(gzfile);  /* read LSB */
     high = (unsigned short) read_byte(gzfile); /* read MSB */
     high = high << 8;  /* interpret MSB correctly */
-    total = low   high; /* add them togethe for correct total */
     total = low   high; /* add them together for correct total */
 
     info->extra_header = (unsigned char *) my_malloc(total);
     fread(info->extra_header, total, 1, gzfile);
@@ -241,7  241,7 @@
      Store the offset with ftell() ! */
 
   if ((info->data_offset = ftell(gzfile))== -1) {
-    printf("error: ftell() retturned -1.\n");
     printf("error: ftell() returned -1.\n");
     exit(1);
   }
 
@@ -249,7  249,7 @@
   printf("I believe start of compressed data is %u\n", info->data_offset);
   #endif
   
-  /* Set postion eight bytes from the end of the file. */
   /* Set position eight bytes from the end of the file. */
 
   if (fseek(gzfile, -8, SEEK_END)) {
     printf("error: fseek() returned non-zero\n");

The svk merge command is almost exactly the same. Instead of printing the differences to your terminal, however, it applies them directly to your working copy as local modifications:

$ svk merge -r 81:82 //calc/trunk
U  integer.c
$ svk status
M  integer.c

The output of svk merge shows that your copy of integer.c was patched. It now contains Sally's change—the change has been “copied” from the trunk to your working copy of your private branch, and now exists as a local modification. At this point, it's up to you to review the local modification and make sure it works correctly.

In another scenario, it's possible that things may not have gone so well, and that integer.c may have entered a conflicted state. You might need to resolve the conflict using standard procedures (see Chapter 3), or if you decide that the merge was a bad idea altogether, simply give up and svk revert the local change.

But assuming that you've reviewed the merged change, you can svk commit the change as usual. At that point, the change has been merged into your repository branch. In version control terminology, this act of copying changes between branches is commonly called porting changes.

When you commit the local modification, make sure your log message mentions that you're porting a specific change from one branch to another. For example:

$ svk commit -m "integer.c: ported r344 (spelling fixes) from trunk."
Sending        integer.c
Transmitting file data .
Committed revision 360.

As you'll see in the next sections, this is a very important “best practice” to follow.

A word of warning: while svk diff and svk merge are very similar in concept, they do have different syntax in many cases. Be sure to read about them in Chapter 9 for details, or ask svk help. For example, svk merge requires a working-copy path as a target, i.e. a place where it should apply the tree-changes. If the target isn't specified, it assumes you are trying to perform one of the following common operations:

  1. You want to merge directory changes into your current working directory.

  2. You want to merge the changes in a specific file into a file by the same name which exists in your current working directory.

If you are merging a directory and haven't specified a target path, svk merge assumes the first case above and tries to apply the changes into your current directory. If you are merging a file, and that file (or a file by the same name) exists in your current working directory, svk merge assumes the second case and tries to apply the changes to a local file with the same name.

If you want changes applied somewhere else, you'll need to say so. For example, if you're sitting in the parent directory of your working copy, you'll have to specify the target directory to receive the changes:

$ svk merge -r 343:344 http://svk.example.com/repos/calc/trunk my-calc-branch
U   my-calc-branch/integer.c

The Key Concept Behind Merging

You've now seen an example of the svk merge command, and you're about to see several more. If you're feeling confused about exactly how merging works, you're not alone. Many users (especially those new to version control) are initially perplexed about the proper syntax of the command, and about how and when the feature should be used. But fear not, this command is actually much simpler than you think! There's a very easy technique for understanding exactly how svk merge behaves.

The main source of confusion is the name of the command. The term “merge” somehow denotes that branches are combined together, or that there's some sort of mysterious blending of data going on. That's not the case. A better name for the command might have been svk diff-and-apply, because that's all that happens: two repository trees are compared, and the differences are applied to a working copy.

The command takes three arguments:

  1. An initial repository tree (often called the left side of the comparison),

  2. An final repository tree (often called the right side of the comparison),

  3. A working copy to accept the differences as local changes (often called the target of the merge).

Once these three arguments are specified, the two trees are compared, and the resulting differences are applied to the target working copy as local modifications. When the command is done, the results are no different than if you had hand-edited the files, or run various svk add or svk delete commands yourself. If you like the results, you can commit them. If you don't like the results, you can simply svk revert all of the changes.

The syntax of svk merge allows you to specify the three necessary arguments rather flexibly. Here are some examples:

      
$ svk merge http://svk.example.com/repos/branch1@150 \
            http://svk.example.com/repos/branch2@212 \
            my-working-copy
            
$ svk merge -r 100:200 http://svk.example.com/repos/trunk my-working-copy

$ svk merge -r 100:200 http://svk.example.com/repos/trunk

The first syntax lays out all three arguments explictly, naming each tree in the form URL@REV and naming the working copy target. The second syntax can be used as a shorthand for situations when you're comparing two different revisions of the same URL. The last syntax shows how the working-copy argument is optional; if omitted, it defaults to the current directory.

Best Practices for Merging

Tracking Merges Manually

Merging changes sounds simple enough, but in practice it can become a headache. The problem is that if you repeatedly merge changes from one branch to another, you might accidentally merge the same change twice. When this happens, sometimes things will work fine. When patching a file, Subversion typically notices if the file already has the change, and does nothing. But if the already-existing change has been modified in any way, you'll get a conflict.

Ideally, your version control system should prevent the double-application of changes to a branch. It should automatically remember which changes a branch has already received, and be able to list them for you. It should use this information to help automate merges as much as possible.

Unfortunately, Subversion is not such a system. Like CVS, Subversion does not yet record any information about merge operations. When you commit local modifications, the repository has no idea whether those changes came from running svk merge, or from just hand-editing the files.

What does this mean to you, the user? It means that until the day Subversion grows this feature, you'll have to track merge information yourself. The best place to do this is in the commit log-message. As demonstrated in the earlier example, it's recommended that your log-message mention a specific revision number (or range of revisions) that are being merged into your branch. Later on, you can run svk log to review which changes your branch already contains. This will allow you to carefully construct a subsequent svk merge command that won't be redundant with previously ported changes.

In the next section, we'll show some examples of this technique in action.

Previewing Merges

Because merging only results in local modifications, it's not usually a high-risk operation. If you get the merge wrong the first time, simply svk revert the changes and try again.

It's possible, however, that your working copy might already have local modifications. The changes applied by a merge will be mixed with your pre-existing ones, and running svk revert is no longer an option. The two sets of changes may be impossible to separate.

In cases like this, people take comfort in being able to predict or examine merges before they happen. One simple way to do that is to run svk diff with the same arguments you plan to pass to svk merge, as we already showed in our first example of merging. Another method of previewing is to pass the --dry-run option to the merge command:

$ svk merge --dry-run -r 343:344 http://svk.example.com/repos/calc/trunk
U  integer.c

$ svk status
#  nothing printed, working copy is still unchanged.

The --dry-run option doesn't actually apply any local changes to the working copy. It only shows status codes that would be printed in a real merge. It's useful for getting a “high level” preview of the potential merge, for those times when running svk diff gives too much detail.

Merge Conflicts

Just like the svk update command, svk merge applies changes to your working copy. And therefore it's also capable of creating conflicts. The conflicts produced by svk merge, however, are sometimes different, and this section explains those differences.

To begin with, assume that your working copy has no local edits. When you svk update to a particular revision, the changes sent by the server will always apply “cleanly” to your working copy. The server produces the delta by comparing two trees: a virtual snapshot of your working copy, and the revision tree you're interested in. Because the left-hand side of the comparison is exactly equal to what you already have, the delta is guaranteed to correctly convert your working copy into the right-hand tree.

But svk merge has no such guarantees and can be much more chaotic: the user can ask the server to compare any two trees at all, even ones that are unrelated to the working copy! This means there's large potential for human error. Users will sometimes compare the wrong two trees, creating a delta that doesn't apply cleanly. svk merge will do its best to apply as much of the delta as possible, but some parts may be impossible. Just like the Unix patch command sometimes complains about “failed hunks”, svk merge will complain about “skipped targets”:

$ svk merge -r 1288:1351 http://svk.example.com/repos/branch
U  foo.c
U  bar.c
Skipped missing target: 'baz.c'
U  glub.c
C  glorb.h

$

In the previous example it might be the case that baz.c exists in both snapshots of the branch being compared, and the resulting delta wants to change the file's contents, but the file doesn't exist in the working copy. Whatever the case, the “skipped” message means that the user is most likely comparing the wrong two trees; they're the classic sign of driver error. When this happens, it's easy to recursively revert all the changes created by the merge (svk revert --recursive), delete any unversioned files or directories left behind after the revert, and re-run svk merge with different arguments.

Also notice that the previous example shows a conflict happening on glorb.h. We already stated that the working copy has no local edits: how can a conflict possibly happen? Again, because the user can use svk merge to define and apply any old delta to the working copy, that delta may contain textual changes that don't cleanly apply to a working file, even if the file has no local modifications.

Another small difference between svk update and svk merge are the names of the full-text files created when a conflict happens. In the section called “Resolve Conflicts (Merging Others' Changes)”, we saw that an update produces files named filename.mine, filename.rOLDREV, and filename.rNEWREV. When svk merge produces a conflict, though, it creates three files named filename.working, filename.left, and filename.right. In this case, the terms “left” and “right” are describing which side of the double-tree comparison the file came from. In any case, these differing names will help you distinguish between conflicts that happened as a result of an update versus ones that happened as a result of a merge.

Noticing or Ignoring Ancestry

When conversing with a Subversion developer, you might very likely hear reference to the term ancestry. This word is used to describe the relationship between two objects in a repository: if they're related to each other, then one object is said to be an ancestor of the other.

For example, suppose you commit revision 100, which includes a change to a file foo.c. Then foo.c@99 is an “ancestor” of foo.c@100. On the other hand, suppose you commit the deletion of foo.c in revision 101, and then add a new file by the same name in revision 102. In this case, foo.c@99 and foo.c@102 may appear to be related (they have the same path), but in fact are completely different objects in the repository. They share no history or “ancestry”.

The reason for bringing this up is to point out an important difference between svk diff and svk merge. The former command ignores ancestry, while the latter command is quite sensitive to it. For example, if you asked svk diff to compare revisions 99 and 102 of foo.c, you would see line-based diffs; the diff command is blindly comparing two paths. But if you asked svk merge to compare the same two objects, it would notice that they're unrelated and first attempt to delete the old file, then add the new file; you would see a D foo.c followed by a A foo.c.

Most merges involve comparing trees that are ancestrally related to one another, and therefore svk merge defaults to this behavior. Occasionally, however, you may want the merge command to compare two unrelated trees. For example, you may have imported two source-code trees representing different vendor releases of a software project (see the section called “”). If you asked svk merge to compare the two trees, you'd see the entire first tree being deleted, followed by an add of the entire second tree!

In these situations, you'll want svk merge to do a path-based comparison only, ignoring any relations between files and directories. Add the --ignore-ancestry option to your merge command, and it will behave just like svk diff. (And conversely, the --notice-ancestry option will cause svk diff to behave like the merge command.)



[15] In the future, the Subversion project plans to use (or invent) an expanded patch format that describes tree-changes.