Wednesday, December 23, 2009

svn needs IDE integration

It's funny; all the world craves svn integration for their IDE and rejects git for the lack of such integration. I, on the other hand, am quite happy without the integration, but then I was raised on a command line.

And then I noticed that, for the usual IDEs (java development, that is), you need IDE integration for svn for a very simple reason: refactoring. If you rename or repackage a class, the source file changes location, and svn needs to record that as a file rename, and you need to tell it so. But once the IDE has moved the file, you can't do the move again with svn, and that would be the only way to make svn aware of it. (You could stop the IDE, move the file back, and then move it forth again with svn, but...shudder.)

Thus the IDE must be able to talk directly to svn. It's not just a question of avoiding a suboptimal command line; it's a necessity. With git, on the other hand, there is no such need: git does not directly track renames at all, it detects them after the fact, and is much smarter about them and about merging.
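To make that concrete, here is a throwaway demo (file names made up): move a file behind git's back, exactly as an IDE would, and git still recognizes the rename when diffing.

```shell
# Throwaway demo repository; Foo.java/Bar.java are made-up names.
set -e
cd "$(mktemp -d)"
git init -q
git config user.email demo@example.org
git config user.name demo
seq 1 20 > Foo.java                 # stand-in for a class about to be renamed
git add Foo.java
git commit -qm initial
mv Foo.java Bar.java                # plain filesystem move, no 'git mv'
git add -A                          # just stage whatever the tree looks like now
git diff --cached -M --name-status  # reports the move as a rename (R100)
```

No git mv was needed anywhere; the rename is reconstructed from content similarity at diff time.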

Sunday, November 15, 2009

svn: more bad

Ranting about tortoisesvn committing new files with CR/LF in them, I finally found that there is apparently no way to make svn behave the right way (in our sense) and apply svn:eol-style=native automatically. You need to set up auto-props in the client configuration. Yet another misdesign. (And if you work on two projects where these defaults differ, you are out of luck.)
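For reference, this is roughly the setup meant here: enable auto-props in the per-user client configuration. The section and option names are svn's own; the file patterns are just examples.

```shell
# Append auto-props settings to the svn client config (~/.subversion/config).
cat >> ~/.subversion/config <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
*.java = svn:eol-style=native
*.c = svn:eol-style=native
*.h = svn:eol-style=native
EOF
```

And note the misdesign: every developer has to repeat this locally; the repository itself cannot mandate it.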

By the way, somehow nobody thought that a by-line diff of svn:externals would be a good thing. We usually have one directory with six or so externals, and when you change one of them, svn diff shows the whole property text as changed, not individual lines. Tortoisesvn is smarter here (as in other places), reinforcing my claim that with svn you need GUI support because the command line interface isn't exactly pretty.

Sunday, November 08, 2009

svn: BAD

svn is just broken.

Google for git commit. The first link points to the current documentation. Ditto for svn commit? No: the first hits point to outdated versions of the manual (usually 1.0, at best 1.5).

Check out a module. Unfortunately, the URL to check out usually ends in trunk, so the default sandbox directory name isn't really helpful. Likewise, it took until 1.6 for them to notice that typing out the repository URL in many commands is a waste, and to allow ^/ repository-relative paths not only in svn:externals but also in commands.

Just for kicks, try an svn ls http://... operation over GPRS. It will take on the order of half a minute, because the GPRS round trip time is about a second, and svn ls makes a good dozen http requests to the server, all serialized, without reusing the connection. It is actually a good idea to prefix such commands with ssh elsewhere: the ssh login is faster, and the svn command then runs over fast links. (This doesn't work for commands involving a sandbox, of course.)

svn import can only import a given tree into a new location within the svn repo; it can't do updates, as are needed to import newer versions of vendor software while keeping the file identity that future merges depend on to stay smooth. There is a script in contrib that can do that, but it never got integrated into the import command proper.

Likewise, don't even expect a command that can create an archive (tar/zip) from the versioned files in the current directory (git archive), or delete all unversioned files in the sandbox (git clean), or tell you a human-readable name for the current revision based on the last tag (git describe).
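A quick sketch of those three in action, in a throwaway repository (all names made up):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.email demo@example.org
git config user.name demo
echo hello > tracked.txt
git add tracked.txt
git commit -qm first
git tag v1.0
echo scratch > junk.tmp            # an unversioned file
git archive -o snapshot.tar HEAD   # archive contains the versioned files only
git clean -n                       # dry run; would remove junk.tmp
git describe --tags                # prints: v1.0
```

With a commit or two on top of the tag, git describe would print something like v1.0-2-g1234abc instead: last tag, commits since, abbreviated hash.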

And, of course, there is the big thing: the hyped merge support of svn 1.5 is broken by design. It isn't fixed in 1.6, and probably can't be fixed without changing the repository format. (I suspect the whole mergeinfo property business is just wrong.) In the meantime, and in a much shorter timespan than from the announcement of svn merge support to its eventual functional delivery, git appeared and just did it right.

I don't like that. We had been waiting for svn 1.5 because of the merge support, and then started to switch. At the same time git appeared on my radar; I could use git cvsimport and git filter-branch to do our conversion from cvs to svn, in a way that allows pulling future changes from cvs as well, without manual intervention. Unfortunately I lost all liking for svn in the process, because git turned out to be much better tooled, more capable, and more flexible. And git develops faster, too; its version number is bound to pass svn's soon.

Thursday, October 22, 2009

The awesomeness of no rename

git does not store renames. All it does is store each version of the whole tree of the managed project. It is the diff and merge tools that look at the trees and file contents and notice that a file has been renamed.

For example, I had a project where a program was converted to C++, piecewise. One file was originally foo.c; then foo.cpp was created by copying and fixing the original C source. foo.c was not removed until later, so for a few commits both existed simultaneously. This was actually in CVS, and I had just imported the stuff to git to work on newer things there.

Now it was time to merge the line originating in CVS into the work branch. git merge looked at all the changes that needed to be merged at once, saw that foo.c was gone and foo.cpp was new over that stretch, and that they were 95% similar, so it assumed a rename and properly merged it into the work branch.

That is the power of not doing or saving renames.
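The story can be replayed in a throwaway repository (file names and contents are made up): one branch renames foo.c to foo.cpp with a plain mv, the other edits foo.c, and the merge carries the edit over into foo.cpp.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.email demo@example.org
git config user.name demo
seq 1 10 > foo.c
git add foo.c
git commit -qm base
git checkout -qb cpp-conversion
mv foo.c foo.cpp                   # the 'rename', done behind git's back
git add -A
git commit -qm 'convert to C++'
git checkout -q -                  # back to the original branch
sed -e 's/^5$/5 changed/' foo.c > foo.c.new && mv foo.c.new foo.c
git commit -qam 'edit foo.c'
git merge -q -m merge cpp-conversion   # rename is detected at merge time
grep '5 changed' foo.cpp               # the edit ended up in the renamed file
```

After the merge, foo.c is gone and the edit made to it lives on in foo.cpp, without anyone ever having recorded a rename.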

Monday, September 28, 2009

git and cvs: Coexistence

There is a relatively trivial way to use git on trees managed by cvs, which I often use to carry around cvs projects for new work and to enjoy the distributed nature (which comes in quite handy on a train). Recipe: check out the target tree with cvs, go into the root, git init, and cvs-files | xargs git add. One git commit -m initial and you are ready to go.

Now you can clone around as you like, do cvs updates in the gitted sandbox, commit those into git, and go back. The only thing is that on the way back to cvs you need to track file additions/deletions manually.

The script cvs-files looks like:

find * -name CVS -type d | while read cdir ; do
    dir=`dirname "$cdir"`/
    test "$dir" = ./ && dir=""
    # CVS/Entries lines look like /name/revision/date/options/tag
    sed -ne 's:^/\([^/][^/]*\)/.*$:\1:p' /dev/null "${dir}CVS/Entries"* | \
        sort -u | while read name ; do
            test -r "$dir$name" && echo "$dir$name"
        done
done

It just goes through the CVS/Entries files to find out what's under CVS control to initially put those into git.

Tuesday, April 07, 2009

There isn't enough coffee in the universe

EVS is me going "Screw it, I'm not playing any more" and writing a
system that can talk to anything, given enough time and coffee.

And this mail came after a loooong day filled with nothing, actually, unless you count trying in vain to get some GeForce card running with openSuSE as work. (Which would have been fun if the driver had not only supported portrait mode but also drawn correctly and used the panel's native resolution. eclipse looks quite good on a portrait screen.)

Anyway, how can one get the idea that all version control systems are basically created equal, and that this is enough to write a universal server that can speak each protocol equally and fully? Even with cvs you can trivially create a repository that can't be converted to, say, git, and vice versa create a repository in git that can't be represented in cvs. In the latter case the revisions would all be there, but the merge information would be missing.

If you take out all the edge cases, or at least sufficiently many of them, it might be made to work. But neither would many existing repositories match those requirements, nor would the resulting functionality be very interesting.

Wednesday, February 25, 2009

MacOS surprise: ssh-agent

I'm new on my macbook. Anyway, since Leopard, ssh asks for passphrases using a dialog instead of on the command line. This is not necessarily a bad idea; for example, a git push from within git gui has no good place to ask.

On the other hand, using ssh-agent for that is even more convenient, and since this dialog allows you to store the passphrase in the keychain, I assumed that that was the preferred way and nobody would bother to start an ssh agent underneath the whole Leopard gui session. Turns out I was wrong. And my workaround of using xterms started from another xterm with an agent in it was quite unnecessary.

Thursday, January 01, 2009

ant: BAD

Those who don't know history are condemned to repeat it. Or not even that, as ant managed. In my opinion, ant started with a bad idea and then made it worse. In other words, it is Broken As Designed.

The quotes are from the aforementioned page, section 'Apache Ant'.

Apache Ant is a Java-based build tool. In theory, it is kind of like Make, but without Make's wrinkles.

Yeah, the wrinkles have been replaced by pointy brackets, and the deep problems haven't been addressed at all.

Why another build tool when there is already make, gnumake, nmake, jam, and others? Because all those tools have limitations that Ant's original author couldn't live with when developing software across multiple platforms.

Well, at least I don't want to live with ant. As little as possible, anyway. Granted, make isn't usable for building java projects either, but then neither is ant without working around the buggy javac task.

Make-like tools are inherently shell-based -- they evaluate a set of dependencies, then execute commands not unlike what you would issue in a shell.

That's the point! Why invent another scripting language when you can have one for free? Instead, ant goes along inventing a new scripting language (which may not even be Turing complete), and it took them years to reach the point where ant can at least do what the good old shell always could.

And still, to do a simple grep Whatever | sed -e s/XX/YY/ you need to write java programs, compile them, feed the jars to ant, and after a few hours get it actually working. Not to mention the fact that those classes can't easily be part of your project, because you need them before starting ant. (Ok, that's not actually true, but only in an ugly way.)
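For contrast, here is that task in the shell, with hypothetical sample data; this is the entire program:

```shell
# Made-up input data, just to have something to filter.
printf 'Whatever XX\nother line\n' > input.txt
grep Whatever input.txt | sed -e 's/XX/YY/'   # prints: Whatever YY
```

Two lines, no compilation step, no jars to feed to anything.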

This means that you can easily extend these tools by using or writing any program for the OS that you are working on. However, this also means that you limit yourself to the OS, or at least the OS type such as Unix, that you are working on.

Gladly so. Nowadays you can easily get VMs or cygwin should you really need to work elsewhere.

At least it's much better than making everything complicated everywhere. The average trivial build.xml contains about 60 lines, of which three actually vary with the project; the rest is always the same. Don't repeat yourself, huh? At least we hope it's the same and there aren't any copy&paste bugs lurking.

Makefiles are inherently evil as well. Anybody who has worked on them for any time has run into the dreaded tab problem. "Is my command not executing because I have a space in front of my tab!!!" said the original author of Ant way too many times.

The original author of ant obviously had the wrong editor, or was missing a little script to check for that. I've written my share of makefiles, and I can't remember ever running into that problem.

If only the original author of ant ran into the real problems that make poses; then ant wouldn't have been such a terrible misdesign.

Ant is different. Instead of a model where it is extended with shell-based commands, Ant is extended using Java classes. Instead of writing shell commands, the configuration files are XML-based, calling out a target tree where various tasks get executed. Each task is run by an object that implements a particular Task interface.

So, instead of writing little one-liners, I need to write code in a programming language that is itself known as verbose, write additional build targets to get that code compiled, and finally knit it all together in XML, yet another source of bloat on top of it all. (And for the nitpickers: ant uses two extra syntactic elements, comma-separated lists and property value replacements. Guess why they don't do those as XML elements.) ant has a serious Dr. No syndrome.

Granted, this removes some of the expressive power that is inherent by being able to construct a shell command such as `find . -name foo -exec rm {}`, but it gives you the ability to be cross platform -- to work anywhere and everywhere.

...and making that a lot of work. Ant really does not fall into the category of 'make the easy things easy and the hard things possible'.

And hey, if you really need to execute a shell command, Ant has an <exec> task that allows different commands to be executed based on the OS that it is executing on.

Now that's a cop-out. If our uber-tool happens to not support what we want, we can still go platform-specific. Thanks; out of general disgust with XML I wrote a little shell script (of all things) that does everything I need in the projects I manage, including compiling and collecting used libraries of my own. /bin/sh isn't exactly the best way to do this, but the whole thing is just five times larger than the (broken) build.xml I got for those projects (and shorter than this blog entry), and the actual build script for one of those condenses to

. "`dirname "$0"`"/../proj-a/tools/
depdir ../proj-a
mkjar proj-b

and is also a shell script.

Now what?

Ok, more details on how ant is misdesigned. The javac task is broken: it only compiles java files that have no corresponding class file or whose class file is older than the java file. It does not erase class files for which there is no longer a source file, nor does it recompile the dependents of a changed class. Especially the latter makes for interesting bugs. Workaround: always clean before compiling, either manually or via the 'dependencies'.

And these 'dependencies'. They aren't, really. Individual tasks sometimes execute conditionally, depending on whether the destination is newer than the source, but you need to state dependencies explicitly, even where ant could deduce them itself. For example, when a javac task produces what a jar task consumes, ant won't make them dependent automatically. Conversely, when you put a target into the dependencies list of another, it is executed, no matter what. Compare that with the following make fragment

prog : main.c gen.c
	cc -o prog main.c gen.c

tool : tool.c
	cc -o tool tool.c

gen.c : tool
	./tool gen.c

which says that gen.c needs to be generated by tool, which in turn needs to be compiled from tool.c. The point here is that when you call make, the tool won't be recompiled unless you modified its source, and gen.c won't be regenerated unless you modified either the generator or its input file.

Ant stays blissfully unaware of any of this dependency management and thus degenerates to the fixed execution of a number of scripts (called targets), each with a number of commands (called tasks). If you want your task to skip work when none is needed, don't expect support from ant. Ant is not really a build tool; it is a simple script executor, even though its creators talk about declarative operations and the ant way, which seems pretty long-winded to me.

The 'declarative way' does not keep people from relying on the fact that the dependencies of a target are executed in the order they are specified. A depends="clean, compile" to work around the javac problem is anything but uncommon, and it will clearly break should ant ever decide to run the dependencies in reverse order.

To be fair, the dependency problem isn't exactly trivial, especially without support from the actual compiler; make doesn't do a good job for java either. But on the other hand, wouldn't we expect an industrial-strength build system to have invested more than a little thought here?

Then ant completely ignored the lesson of the X Window System. Those guys actually improved on make by using the C preprocessor. Their Imakefile system inspired another (proprietary) system that could just say

CProgramFromSources (prog) {
	CSource (gen)
	CSource (main)
	CProgram (tool)
}

gen.c : tool
	./tool gen.c

with one important difference: the macros expand so that not only does make prog do the expected thing, but make clean also removes all the temporaries, except for gen.c, which was made with a plain make rule. We need to add

clean ::
	rm -f gen.c

to make that work, too. This also shows another overlooked make feature: you can actually combine a target from multiple separate commands and dependencies (a double-colon rule). It's just not possible to have multiple clean targets in ant, which makes it even more error-prone to do the right cleanout.

Ant simply aims too low by one or two levels of abstraction.

Then what?

Unfortunately ant has gained a lot of traction in the java community, proving again that you need to be first, not best. This is compounded by the fact that most programmers don't care about the build system any more than needed to make it apparently work. And everything and the kitchen sink is available as an ant task, so it's not just the programmers who would have to turn around.

And I'm not exactly in a position to get much traction for a change. The most promising way is to fight the system from within, perhaps by actually having some preprocessor (again) generate a tmp/build.xml to then be included.

The most depressing thing is that ant will make the majority of people think that this is the state of the art in build system design. Far from it.