Exploiting Gource

The code

The code analyzed is taken from the HEAD commit on master on https://github.com/acaudwell/Gource (474870a3d7af80e109e3fd5ab328039c259e6821).

This was written while performing the review so don't expect everything to work flawlessly ;)

I start by looking for system() as it's often a reliable way to produce bugs. It turns out to be present under the wrapper systemCommand(). This wrapper is there to ensure compatibility with C++ strings. Where is this wrapper used?

# avoid the definition place

$ grep -r "systemCommand[^{]\+"
formats/bzr.cpp:    int command_rc = systemCommand(cmd_buff);
formats/svn.cpp:    int command_rc = systemCommand(cmd_buff);
formats/hg.cpp:    int command_rc = systemCommand(cmd_buff);
formats/git.cpp:    int command_rc = systemCommand(cmd_buff);
formats/git.cpp:        command_rc = systemCommand(cmd_buff);

Ok, so pretty much for every format. We'll start with git.

The function that uses this function is GitCommitLog::generatelog. Here is the relevant part (starting line 98):

char cmd_buff[2048];
snprintf(cmd_buff, 2048, "%s > %s", command.c_str(), temp_file.c_str());

int command_rc = systemCommand(cmd_buff);

if(command_rc != 0) {
    return 0;

// check for new-enough Git version
// if %aN does not appear to be supported try %an
std::ifstream in(temp_file.c_str());
char firstBytes[9];
in.read(firstBytes, 8);
firstBytes[8] = '\0';
if(!strcmp(firstBytes, "user:%aN")) {
    char *pos = strstr(cmd_buff, "%aN");
    pos[2] = 'n';
    command_rc = systemCommand(cmd_buff);

Ok, so we build a buffer with a command and a temporary file path. Then we execute it. If the temporary file does not begin by "user:%aN" we also get the benefit of a second execution of the same payload (with one byte changed if we have "%aN" in our command).

There are two problems that we will see in order here: arbitrary command execution through parameter injection and possible file overwrite.

Actually there is a third potential problem: a possible null pointer dereference if "%aN" isn't found in the command, but it turns out to be impossible.

Arbitrary code execution?

The most common vulnerability associated with the use of system() is the lack of proper escaping when building the command string.

Here the command is build in the function GitCommitLog::logCommand:

//git-log command notes:
// - no single quotes on WIN32 as system call treats them differently
// - 'user:' prefix allows us to quickly tell if the log is the wrong format
//   and try a different format (eg cvs-exp)

std::string GitCommitLog::logCommand() {

    std::string log_command = "git log "
    "--pretty=format:user:%aN%n%ct "
    "--reverse --raw --encoding=UTF-8 "

    if(!gGourceSettings.start_date.empty()) {
        log_command += " --since ";
        log_command += gGourceSettings.start_date;

    if(!gGourceSettings.stop_date.empty()) {
        log_command += " --until ";
        log_command += gGourceSettings.stop_date;

    if(!gGourceSettings.git_branch.empty()) {
        log_command += " ";
        log_command += gGourceSettings.git_branch;

    return log_command;

Dates are generally highly structured values, we won't lose time on them. The branch name is more interesting. Where does it come from? It turns out it can be provided with a command line flag. Here is the relevant code in gource_settings.cpp:

if((entry = gource_settings->getEntry("git-branch")) != 0) {

    if(!entry->hasValue()) conffile.missingValueException(entry);

    Regex branch_regex("^(?!-)[/\\w.,;_=+{}\\[\\]-]+$");

    std::string branch = entry->getString();

    if(branch_regex.match(branch)) {
        git_branch = branch;
    } else {

So we check the branch name with a regex. We cannot use space, quotes, a dollar sign, parentheses an exclamation mark or a backslash. This makes any shell injection very, very hard. So let's skip this one, we won't make it.

File overwrite

The other vulnerability is a file overwrite. Where? In the first snippet we saw.

char cmd_buff[2048];
snprintf(cmd_buff, 2048, "%s > %s", command.c_str(), temp_file.c_str());

Why is it problematic? Well, snprintf will write a maximum of 2048 bytes into the cmd_buff, but remember that we can control the branch name which will impact the length of the command. What if command takes 2036 bytes? With the redirection and the terminating null byte it will leave only 8 byte for the destination path no matter what its complete name is. So if the file path is /tmp/tmp.XXXXXX the command will in fact be truncated just before the dot so that the output file is /tmp/tmp.

This means that if the temporary file name has a deterministic beginning we can write the output of the command to a deterministic path.

So the question is: how are those temporary paths determined? This is done by the RCommitLog::createTempLog function (ignoring everything WIN32 related):

void RCommitLog::createTempLog() {

    std::string tempdir;

    tempdir = "/tmp/";

    char tmplate[1024];
    snprintf(tmplate, 1024, "%sgource-XXXXXX", tempdir.c_str());

    if(mkstemp(tmplate) < 0) return;

    temp_file = std::string(tmplate);

So do we have a deterministic beginning? We sure do! All paths will begin by /tmp/gource-. And note that /tmp is a directory that is notoriously open to all users in read and write access.

This can be used in many attacks although the most common is some form of privilege escalation.

Privilege escalation

Privilege escalations are an interesting topic because contrary to a situation in which you are a remote attacker you already have some leverage on the system.

Here we will consider an hypothetical case where a service allows the user to run gource on a project provided by the user through a dedicated interface that allows him to specify the branch of the project. The user is unprivileged but has access to the machine. The service is more privileged than the user.

As we saw, having leverage over the branch name means the user can force the output to a specific file. The user has access to the /tmp directory. He will specify a branch long enough to force the output to /tmp/go. It could be any name really, all we need is for it not to exist (or under our control).

Before launching Gource the user will create a symbolic link from /tmp/go to another file. For example ~/.ssh/authorized_keys. This file lists public keys that can access one's account through ssh. The link needs to be adapted to the home of the home of the user running the service but it is not a problem as /etc/passwd is publicly available.

If we run Gource at this point we will effectively write into the service's .ssh/authorized_keys. This will destroy all its keys (the principle can be used for DoS by overwritting important files) but we want more. We want our own key in that file.

What are we writting exactly? Due to the format imposed to git log (the command we're executing) it looks like this:

user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell
user:Andrew Caudwell

A bunch of user names and timestamps. How can we get our key in there? By commiting to a project with our key as username! That's why we need to have some leverage on the project being Gourced.

The authorized_keys file has a nice property: it ignores any line that doesn't begin by a key. So all we need to do is make sure the key is alone on its line. We do that by prepending and appending a carriage return to our key. This makes it a very long and weird line but who cares.

By launching the service with this project, our symbolic link in place and the proper branch name we will insert our public key into the service's authorized_keys. All we need to do next is using ssh to log in with its account and benefit of all its privileges.

Of course this can be adapted: we could have overwritten the bashrc, or the crontab to insert a shell command for example.

This is hard to pull off remotely because it requires some leverage but it is by no mean impossible to do.

The fix

The best way to fix is to check that you aren't writting more than you should. snprintf returns the number of bytes it would have written if it had not been stopped. A correct code could be:

char cmd_buff[2048];
int written = snprintf(cmd_buff, 2048,
                       "%s > %s", command.c_str(), temp_file.c_str());
if (written < 0 || written >= 2048)
    return 0;