Extracting an Archive Into Jazz SCM Using the Plain Java Client Libraries


How can one share or extract data from an archive file directly into the Jazz SCM system? This one question has a lot of potential. If you can do that, any other source of information could also be used to extract data and store it in Jazz SCM. In essence, it becomes possible to write your own SCM migration tool.

This post shows some of the Client API used for this kind of operation.

Evan’s SCM Lounge blog post Getting your stuff – using the RTC SDK to zip a repository workspace explains the reverse direction, in case you are interested. His post Committing content to RTC SCM with the SDK is also a great explanation of the Jazz SCM API.

As described in the articles

  1. Deploying Templates and Creating Projects using the Plain Java Clients Library
  2. Managing Workspaces, Streams and Components using the Plain Java Client Libraries
  3. Delivering Change Sets and Baselines to a Stream Using the Plain Java Client Libraries
  4. This post

I want to automatically set up a project, manage the streams and components and seed the SCM system with SCM data.

The last post explains how the SCM operations on the repository workspace and stream work. Specifically, how to deliver outgoing change sets from a repository workspace to a stream, baseline the state of a component in the repository workspace, deliver the baseline to the stream and set a component of the repository workspace to a specific baseline.

This post explains how to

  • Extract data from an archive file
  • Find or create files and folders in the Jazz SCM system
  • Compare the content of files in the Jazz SCM with content in another source
  • Check in/commit changes to files and folders into the Jazz SCM

License and How to Get Started With the RTC APIs

As always, our lawyers reminded me to state that the code in this post is derived from examples from Jazz.net as well as the RTC SDK. The usage of code from that example source code is governed by this license. Therefore this code is governed by this license, which basically means you can use it internally, but not sell it. Please also remember, as stated in the disclaimer, that this code comes with the usual lack of promise or guarantee.

If you are just getting started with extending Rational Team Concert or creating API-based automation, start with the post Learning To Fly: Getting Started with the RTC Java API’s and follow the linked resources.

You should be able to use the following code in this environment and get your own automation or extension working.

To keep it simple, this example is, like many others in this blog, based on the Jazz Team Wiki entry on Programmatic Work Item Creation and the Plain Java Client Library Snippets. The example in this post uses the RTC Client API.

The ArchiveToSCMExtractor Class

The extraction process is wrapped in the class ArchiveToSCMExtractor. The code below shows the imports, fields and the basic constructor. The archive file is expected to be in the format created when compressing a folder or exporting projects from an Eclipse client as archive files: folders at root level that contain files and folders.

/*******************************************************************************
 * Licensed Materials - Property of IBM
 * (c) Copyright IBM Corporation 2013. All Rights Reserved. 
 * 
 * ArchiveToSCMExtractor
 *
 * Note to U.S. Government Users Restricted Rights:  Use, duplication or 
 * disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
 *******************************************************************************/
package com.ibm.js.rtcext.serversetup;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.core.runtime.NullProgressMonitor;

import com.ibm.team.filesystem.client.FileSystemCore;
import com.ibm.team.filesystem.client.IFileContentManager;
import com.ibm.team.filesystem.common.FileLineDelimiter;
import com.ibm.team.filesystem.common.IFileContent;
import com.ibm.team.filesystem.common.IFileItem;
import com.ibm.team.repository.client.ITeamRepository;
import com.ibm.team.repository.common.TeamRepositoryException;
import com.ibm.team.scm.client.IConfiguration;
import com.ibm.team.scm.client.IWorkspaceConnection;
import com.ibm.team.scm.client.content.util.VersionedContentManagerByteArrayInputStreamPovider;
import com.ibm.team.scm.common.IChangeSetHandle;
import com.ibm.team.scm.common.IComponentHandle;
import com.ibm.team.scm.common.IFolder;
import com.ibm.team.scm.common.IFolderHandle;
import com.ibm.team.scm.common.IVersionable;
import com.ibm.team.scm.common.IVersionableHandle;

/**
 * Extracts data from a compressed archive file into the RTC Jazz SCM.
 * The data is extracted directly into the component, provided in the call.
 * The assumption is, that the data in the archive is inside of one or more 
 * folders, which can be loaded with the component.
 * 
 * @see com.ibm.team.repository.service.tests.migration.ZipUtils
 * 
 */
public class ArchiveToSCMExtractor {

	// The ZipInputStream
	private ZipInputStream fZipInStream = null;
	// The team repository
	private ITeamRepository fTeamRepository = null;
	// The workspace connection
	private IWorkspaceConnection fWorkspace = null;
	// The change set
	private IChangeSetHandle fChangeSet = null;
	// The configuration to access the SCM data
	private IConfiguration fConfiguration = null;
	// The progress monitor we are using
	private IProgressMonitor fMonitor = new NullProgressMonitor();
	/**
	 * Simple constructor
	 * 
	 */
	public ArchiveToSCMExtractor() {
		super();
	}

Some of the data used during the extraction process is kept in fields, which keeps the method calls simple. The constructor in this example does nothing beyond calling super(). It could be enhanced if you have other needs.

The required data is passed to the public method extractFileToComponent(), which is the entry point for the extraction process. This method is described below.

The method extractFileToComponent() requires the archive file name, the ITeamRepository, the workspace connection representing the repository workspace, the component that the data will be sent to, a comment for the change set that will contain the changes and the monitor.

Note that the ITeamRepository could also be obtained from the component using getOrigin(), or from the workspace connection by getting the owner and then its origin.

The method stores the passed information in the fields, so that other methods can consume the data. It creates a new change set that contains all changes created during the process. Then it creates a new ZipInputStream for the archive file. If all of this succeeds, it starts the extraction process by calling the method extract() and returns its value.

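/**
 * Extract the archive. This is the entry point for the extraction.
 * 
 * @param archiveFileName path of the archive file to extract
 * @param teamRepository the repository that hosts the SCM data
 * @param targetWorkspace connection to the target repository workspace
 * @param component the component that receives the data
 * @param changeSetComment comment for the change set that is created
 * @param monitor the progress monitor to use
 * @return true if all entries were extracted without an exception,
 *         false otherwise
 * @throws Exception
 */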
public boolean extractFileToComponent(String archiveFileName,
		ITeamRepository teamRepository, IWorkspaceConnection targetWorkspace,
		IComponentHandle component, String changeSetComment,
		IProgressMonitor monitor) throws Exception {

	fMonitor=monitor;
	fTeamRepository = teamRepository;
	fWorkspace = targetWorkspace;
	fChangeSet = fWorkspace.createChangeSet(component, changeSetComment,
			true, monitor);
	fConfiguration = fWorkspace.configuration(component);
	File archiveFile = new File(archiveFileName);
	System.out.println("Extract: " + archiveFile.getPath());
	try {
		FileInputStream fileInputStream = new FileInputStream(archiveFile);
		fZipInStream = new ZipInputStream(fileInputStream);
		try {
			return extract();
		} finally {
			fileInputStream.close();
		}
	} catch (Exception e) {
		System.out.println("Extract Exception: " + e.getMessage());
		e.printStackTrace();
		return false;
	}
}

The code for extract() is presented below. The method iterates over all entries in the ZipInputStream.

For each entry it checks whether the entry is a directory. If so, it uses the method findOrCreateFolderWithParents() to find the folder in the Jazz SCM system. If necessary, that method creates the folder, including any parent folders that don’t yet exist.

In case the entry represents a file the work of extracting the file is delegated to the method extractFile().

Please note that archive files don’t necessarily contain entries for all folders. Some store only the files, and the folder structure needs to be recreated from the file paths. This method also handles archives that provide folder entries, even if the folders are empty.

/**
 * Extract the content of an archive to a component.
 * This assumes that the archive contains folders on top level. 
 * These folders can act as projects when loading.
 * 
 * @return
 * @throws IOException
 * @throws TeamRepositoryException
 */
private boolean extract() throws IOException,
		TeamRepositoryException {
	ZipEntry entry = fZipInStream.getNextEntry();
	boolean result = true;
	while (entry != null) {
		File targetEntry = new File(entry.toString());
		try {
			if (entry.isDirectory()) {
				System.out.println("Extracting Folder: "
						+ targetEntry.getPath());
				findOrCreateFolderWithParents(targetEntry);
			} else {
				System.out.print("Extracting File: "
						+ targetEntry.getPath());
				extractFile(targetEntry, entry);
				System.out.println(" OK");
			}
		} catch (Exception e) {
			System.out.println("Extract Exception: " + e.getMessage());
			e.printStackTrace();
			result = false;
		} finally {
			fZipInStream.closeEntry();
		}
		entry = fZipInStream.getNextEntry();
	}
	return result;
}
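
The remark about archives that store only file entries can be illustrated with plain java.util.zip code. The following is a minimal sketch, not part of ArchiveToSCMExtractor: the class ZipFolderScan and its methods are hypothetical names, and no RTC API is involved. It collects every folder path implied by the entries, whether or not the archive contains an explicit directory entry for it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of the RTC API.
class ZipFolderScan {

    /** Returns every folder path implied by the archive, parents first. */
    static Set<String> impliedFolders(byte[] zipBytes) throws IOException {
        Set<String> folders = new LinkedHashSet<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Directory entries end with '/'; for files, take the parent path.
                String path = entry.isDirectory()
                        ? name.substring(0, name.length() - 1)
                        : name.substring(0, Math.max(0, name.lastIndexOf('/')));
                if (path.isEmpty()) {
                    continue; // the entry sits directly in the archive root
                }
                // Record each ancestor folder before the folder itself.
                for (int slash = path.indexOf('/', 1); slash != -1;
                        slash = path.indexOf('/', slash + 1)) {
                    folders.add(path.substring(0, slash));
                }
                folders.add(path);
            }
        }
        return folders;
    }

    /** Builds a small in-memory archive: one file entry, one empty folder entry. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // No directory entries exist for "project" or "project/src".
            zip.putNextEntry(new ZipEntry("project/src/Main.java"));
            zip.write("class Main {}".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("project/empty/"));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // prints [project, project/src, project/empty]
        System.out.println(impliedFolders(sampleZip()));
    }
}
```

In the extractor itself the same effect comes from calling findOrCreateFolderWithParents() for the parent of every file, so explicit directory entries are only needed for empty folders.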

The method extractFile() below works as follows. First, it tries to find the parent folder for the file in the Jazz SCM system, using findOrCreateFolderWithParents().

The method findOrCreateFolderWithParents() tries to find the parent folder of the file in the Jazz SCM system and creates the folder and all its parents, if necessary. The IFolder returned represents the directory folder object in the SCM system, that is supposed to contain the file.

Now the method uses getFile() to try to find the file, represented as an IFileItem, in the parent IFolder in the Jazz SCM system. If the file does not exist, the method returns null and createFileItem() is used to create the IFileItem in the parent IFolder. Now the IFileItem is available and we can access its content, regardless of whether it already existed or was just created.

In the next step the file content is copied from the zip entry into a ByteArrayOutputStream. This step is necessary, because the VersionedContentManagerByteArrayInputStreamPovider closes the stream provided with the data. This is usually not a problem, but in our case we need the ZipInputStream to stay open to process the next entries. It might be possible to write your own VersionedContentManagerStreamProvider that does not close the file. I tried to do that but had issues with certain files. I am not sure what the reason was, maybe some issue with converting the content. So I decided to use an existing provider instead.

The next step is to get an IFileContentManager, which is needed to compare the content of the file we just copied with the content in the Jazz SCM. This interface is used to create the IFileContent that contains the data of the ZipEntry. This process requires providing the file encoding as well as the line delimiter used in the content. In our case we only want to handle text files and can pick the right values easily. In more complex scenarios, with files that use different encodings and line delimiters, it would be necessary to determine the values somehow. One strategy would be to use the file extension. The VersionedContentManagerByteArrayInputStreamPovider is used to convert the data we copied before and provide it for storage.

The next step is to compare the content from the archive file with the content of the file in the Jazz SCM that we just found or created. If the contents are different, it is necessary to modify the object in the Jazz SCM system.

If the file existed before, it is necessary to get a working copy of the file to do so. After retrieving the working copy, the content is set to the new value and the change is committed to the workspace. A newly created file has no content yet, so the new content is always set and the file creation is committed along with it.

After committing the change, or doing nothing, if there was no change, this method is done.

/**
 * Extract a file from a ZipEntry to the Jazz SCM system. Currently only
 * Text/UTF-8 files are supported. 
 * 
 * Commit the file if there are changes because
 * the file did not exist or there are changes in the new content compared
 * to the existing file.
 * 
 * @param targetFile
 * @param zipEntry
 * @throws FileNotFoundException
 * @throws IOException
 * @throws TeamRepositoryException
 * @throws InterruptedException
 */
private void extractFile(File targetFile, ZipEntry zipEntry)
		throws FileNotFoundException, IOException, TeamRepositoryException,
		InterruptedException {
	IFolder parentFolder = findOrCreateFolderWithParents(
			targetFile.getParentFile());

	IFileItem aFile = getFile(targetFile, parentFolder);
	if (aFile == null) {
		aFile = createFileItem(targetFile.getName(), zipEntry, parentFolder);
		System.out.print(" ... Created");
	}

	ByteArrayOutputStream contents = copyFileData(fZipInStream);
	try {
		IFileContentManager contentManager = FileSystemCore
				.getContentManager(fTeamRepository);
		IFileContent storedzipContent = contentManager.storeContent(
				IFileContent.ENCODING_UTF_8,
				FileLineDelimiter.LINE_DELIMITER_PLATFORM,
				new VersionedContentManagerByteArrayInputStreamPovider(
						contents.toByteArray()), null, fMonitor);

		// Compare the files. If there is a difference, set the new 
		// content and commit the change
		if (!storedzipContent.sameContent(aFile.getContent())) 
		{
			IFileItem fileWorkingCopy = (IFileItem) aFile.getWorkingCopy();
			fileWorkingCopy.setContent(storedzipContent);
			fWorkspace.commit(fChangeSet, Collections
					.singletonList(fWorkspace.configurationOpFactory()
							.save(fileWorkingCopy)), fMonitor);
			System.out.print(" ... Content");
		}
	} finally {
		contents.close();
	}
}
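
The idea of a stream that survives a consumer closing it can be sketched with plain Java I/O. This is only an illustration of the general technique, under the assumption that a wrapper with a no-op close() is acceptable to the consumer. Whether the Jazz content manager works with such a stream is not verified here, and the author reports problems with a custom provider, which is why the post buffers the data instead. NonClosingInputStream, consumeAndClose() and the other names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration; none of these names come from the RTC API.
class CloseShieldDemo {

    /** Wrapper whose close() leaves the wrapped stream open. */
    static final class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Intentionally empty: the owner of the wrapped stream closes it.
        }
    }

    /** Stand-in for a consumer that closes whatever stream it is given. */
    static int consumeAndClose(InputStream in) throws IOException {
        try (InputStream closing = in) {
            int count = 0;
            while (closing.read() != -1) {
                count++;
            }
            return count;
        }
    }

    /** Reads every entry through the closing consumer and returns the sizes. */
    static List<Integer> entrySizes(byte[] zipBytes) throws IOException {
        List<Integer> sizes = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (zip.getNextEntry() != null) {
                // The shield absorbs close(), so the next entry stays readable.
                sizes.add(consumeAndClose(new NonClosingInputStream(zip)));
            }
        }
        return sizes;
    }

    /** Builds an in-memory archive with two small text entries. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("jazz scm".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(entrySizes(sampleZip())); // prints [5, 8]
    }
}
```

Copying each entry into a byte array, as extractFile() does, costs memory per file but avoids any dependency on how the consumer treats the stream.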

How are folders found and created? This is dealt with in the method findOrCreateFolderWithParents().

If the folder has no parent path, it is located directly in the root, and the method uses the completeRootFolder() of the component configuration as the parent. Otherwise the method recursively finds or creates the parent of the current folder. Once a valid parent is available, the method uses getFolder() to find the IFolder relative to that parent. If none can be found, a new folder is created with createFolder(). The folder is then returned.

The recursive call first resolves the outermost parent, a folder directly below the root folder of the component, and then finds or creates each deeper folder in the hierarchy as the recursion unwinds. The recursion simply makes walking the path easier. Another approach would be to split the path into segments and iterate over them, finding the next deeper level beginning with the root. Remembering a stack of folders would be another optimization option. Evan’s post Committing content to RTC SCM with the SDK shows another way to resolve a path that I had overlooked: using IConfiguration.resolvePath() might be an option too.

/**
 * Find a folder in the Jazz SCM system. Create the folder if required. Also
 * finds and, if necessary, creates the required parent folders. Could be
 * optimized by keeping the folder stack.
 * 
 * @param folder
 * @return
 * @throws TeamRepositoryException
 */
private IFolder findOrCreateFolderWithParents(File folder)
		throws TeamRepositoryException {

	IFolder parent = null;
	String folderName = folder.getName();
	String parentName = folder.getParent();
	if (parentName == null) {
		parent = fConfiguration.completeRootFolder(fMonitor);
	} else {
		// Recursively find the parent folders
		parent = findOrCreateFolderWithParents(new File(parentName));
	}
	IFolder found = getFolder(folderName, parent);
	if (found == null) {
		found = createFolder(folderName, parent);
	}
	return found;
}
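
The alternative mentioned above, splitting the path and walking the segments from the root, can be sketched independently of the RTC API. In this minimal sketch the class Folder is a hypothetical in-memory stand-in for IFolder, and the map lookup stands in for getFolder() and createFolder(); it only shows the shape of the loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model; Folder stands in for IFolder.
class SegmentWalk {

    static final class Folder {
        final String name;
        final Map<String, Folder> children = new LinkedHashMap<>();

        Folder(String name) {
            this.name = name;
        }
    }

    /** Walks the path from the root, creating missing folders on the way. */
    static Folder findOrCreate(Folder root, String path) {
        Folder current = root;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // tolerate leading or doubled separators
            }
            Folder next = current.children.get(segment); // like getFolder()
            if (next == null) {
                next = new Folder(segment);              // like createFolder()
                current.children.put(segment, next);
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Folder root = new Folder("");
        Folder util = findOrCreate(root, "project/src/util");
        System.out.println(util.name); // prints util
        // A second call reuses the folders created by the first one.
        System.out.println(findOrCreate(root, "project/src").children.keySet()); // prints [util]
    }
}
```

The loop does the same work as the recursion, just in the opposite direction, so the choice between the two is mostly a matter of taste.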

The method getFolder() checks whether the parent folder has an entry for a folder with a specific name. It uses getVersionable() to get an IVersionable item by name and parent. It then checks whether the item is an IFolder and returns the item if that is the case, or null if not.

/**
 * Find a folder in an existing parent folder.
 * 
 * @param folderName
 * @param parentFolder
 * @return
 * @throws TeamRepositoryException
 */
@SuppressWarnings("unchecked")
private IFolder getFolder(String folderName, IFolderHandle parentFolder)
		throws TeamRepositoryException {

	IVersionable foundItem = getVersionable(folderName, parentFolder);
	if(null!=foundItem){
		if (foundItem instanceof IFolder) {
			return (IFolder) foundItem;
		}
	}
	return null;
}

The method getVersionable() first gets all child entries of the parent. The returned map contains the names of the entries and their IVersionableHandles. The method tries to get the handle of an item by name. If there is a handle, it fetches the complete item and returns it. Otherwise it returns null to show that no matching element was found.

/**
 * Gets a versionable with a specific name from a parent folder.
 * 
 * @param name
 * @param parentFolder
 * @return
 * @throws TeamRepositoryException
 */
private IVersionable getVersionable(String name, IFolderHandle parentFolder)
		throws TeamRepositoryException {
	// get all the child entries
	@SuppressWarnings("unchecked")
	Map<String, IVersionableHandle> handles = fConfiguration.childEntries(
			parentFolder, fMonitor);
	// try to find an entry with the name
	IVersionableHandle foundHandle = handles.get(name);
	if(null!=foundHandle){
		return fConfiguration.fetchCompleteItem(foundHandle, fMonitor);
	}
	return null;
}

If no folder with the correct name can be found, a new folder must be created. This is done in the method createFolder() shown below. The method creates a new IFolder, sets the parent folder and the name, and then commits it to the Jazz SCM as part of the change set.

/**
 * Create a folder and commit it to SCM.
 * 
 * @param folderName
 * @param parent
 * @return
 * @throws TeamRepositoryException
 */
private IFolder createFolder(String folderName, IFolder parent)
		throws TeamRepositoryException {
	IFolder newFolder = (IFolder) IFolder.ITEM_TYPE.createItem();
	newFolder.setParent(parent);
	newFolder.setName(folderName);
	fWorkspace.commit(fChangeSet, Collections.singletonList(fWorkspace
			.configurationOpFactory().save(newFolder)), fMonitor);
	return newFolder;
}

Similar to finding the parent folder, it is necessary to find existing files in extractFile(). The method getFile() does this analogously to the method getFolder() above, again using getVersionable().

/**
 * Tries to find a IFileItem node in a given IFolder. Returns the IFileItem
 * found or null if none was found.
 * 
 * @param file
 * @param parentFolder
 * @return
 * @throws TeamRepositoryException
 */
private IFileItem getFile(File file, IFolderHandle parentFolder)
		throws TeamRepositoryException {
	IVersionable foundItem = getVersionable(file.getName(), parentFolder);
	if(null!=foundItem){
		if (foundItem instanceof IFileItem) {
			return (IFileItem) foundItem;
		}
	}
	return null;
}

Again, if a matching file cannot be found in the SCM system, it is necessary to create one. This is done in createFileItem(), which works similarly to the method createFolder() above. A new item is created and the necessary properties are set. Unlike createFolder(), the new item is not committed to Jazz SCM here, because this is done after setting the file content. If you try to commit the file without setting the content, the operation fails.

/**
 * Tries to create a IFileItem node in a given IFolder. Returns the
 * IFileItem.
 * 
 * @param name
 * @param zipEntry
 * @param parentFolder
 * 
 * @return
 * @throws TeamRepositoryException 
 */
private IFileItem createFileItem(String name, ZipEntry zipEntry,
		IFolder parentFolder) throws TeamRepositoryException {
	IFileItem aFile = (IFileItem) IFileItem.ITEM_TYPE.createItem();
	aFile.setParent(parentFolder);
	aFile.setName(name);
	aFile.setContentType(IFileItem.CONTENT_TYPE_TEXT);
	aFile.setFileTimestamp(new Date(zipEntry.getTime()));
	return aFile;
}
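
createFileItem() always sets the text content type. The strategy suggested earlier, deciding by file extension, could be sketched as follows. This is a hypothetical helper, not the author's code: the enum Kind and the extension list are assumptions for illustration, and mapping the result to real values (for example a content type constant on IFileItem and a suitable FileLineDelimiter) is left open here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the extension list is an assumption, not a standard.
class ContentKindGuess {

    enum Kind { TEXT, BINARY }

    private static final Set<String> TEXT_EXTENSIONS = new HashSet<>(
            Arrays.asList("java", "txt", "xml", "properties", "md"));

    /** Guesses the kind of content from the extension of a bare file name. */
    static Kind guess(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Kind.BINARY; // no extension: treat as binary to stay safe
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TEXT_EXTENSIONS.contains(extension) ? Kind.TEXT : Kind.BINARY;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Main.java", "manual.PDF", "README"}) {
            System.out.println(name + " -> " + guess(name));
        }
    }
}
```

Defaulting to BINARY for unknown extensions is a deliberate choice in this sketch: treating binary data as text risks line delimiter conversion, while the opposite mistake only loses text-specific handling.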

The last thing missing is the method that copies the file data so that the content can be created for the SCM compare and store operations. The method copyFileData() below does just this: it copies the data from one stream to another. This prevents our ZipInputStream from being closed and provides the byte array needed for storing the data.

/**
 * Copy the data from an input stream to an output stream. This is done to
 * avoid the Jazz SCM closing the stream that contains the original data.
 * 
 * @param zipInStream
 * @return
 * @throws IOException
 */
private ByteArrayOutputStream copyFileData(InputStream zipInStream)
		throws IOException {
	ByteArrayOutputStream contents = new ByteArrayOutputStream();
	byte[] buf = new byte[2048];
	int read;
	while ((read = zipInStream.read(buf)) != -1) {
		contents.write(buf, 0, read);
	}
	contents.flush();
	return contents;
}

A closing bracket finalizes the class.

The class is used in the method exctractToComponentBaseline() from the post Delivering Change Sets and Baselines to a Stream Using the Plain Java Client Libraries as shown below.

	// Extract the archive file to the component in the repository
	// workspace
	System.out.println("Extracting...");
	ArchiveToSCMExtractor extract = new ArchiveToSCMExtractor();
	if(!extract.extractFileToComponent(archiveFileName, teamRepository,
			repoWorkspace, component, changeSetComment, monitor)){
		throw new Exception("Exception extracting " + archiveFileName);
	}

The code above could be enhanced by managing a stack of the folders that have already been found. Because archive files usually do a recursive descent through the file system, this would reduce the number of lookups for the parent folder considerably. The code to manage this would be too complex for this post, and for the scenario this code was designed for, the performance is acceptable.
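
To give an idea of what remembering folders could look like, here is a minimal sketch that uses a map keyed by path instead of a stack. It is not the author's implementation: FolderCache is a hypothetical class, and the function passed to its constructor stands in for the getFolder()/createFolder() pair, which is the expensive repository lookup in the real extractor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

// Hypothetical cache; F stands in for the folder type (IFolder in the post).
class FolderCache<F> {

    private final Map<String, F> cache = new HashMap<>();
    private final F root;
    private final BiFunction<F, String, F> findOrCreateChild;

    FolderCache(F root, BiFunction<F, String, F> findOrCreateChild) {
        this.root = root;
        this.findOrCreateChild = findOrCreateChild;
    }

    /** Resolves a '/'-separated folder path, asking the lookup only once per folder. */
    F resolve(String path) {
        if (path == null || path.isEmpty()) {
            return root;
        }
        F cached = cache.get(path);
        if (cached != null) {
            return cached;
        }
        int slash = path.lastIndexOf('/');
        F parent = slash < 0 ? root : resolve(path.substring(0, slash));
        F folder = findOrCreateChild.apply(parent, path.substring(slash + 1));
        cache.put(path, folder);
        return folder;
    }

    public static void main(String[] args) {
        AtomicInteger lookups = new AtomicInteger();
        // Folders are modeled as plain path strings for this demonstration.
        FolderCache<String> cache = new FolderCache<String>("", (parent, name) -> {
            lookups.incrementAndGet();
            return parent.isEmpty() ? name : parent + "/" + name;
        });
        cache.resolve("project/src/util");
        cache.resolve("project/src/io");
        System.out.println(lookups.get()); // prints 4
    }
}
```

Without the cache, the two calls in main() would cost six lookups instead of four, and the saving grows with the depth and number of files in the archive.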

Summary

The code in this post shows how to work with data against repository workspaces. In the example the data is taken from an archive using ZipInputStream. However, this is just an example; the data could come from a file system, from another application such as another SCM system, or from anywhere else.

Together with this post, the posts Managing Workspaces, Streams and Components Using the Plain Java Client Libraries and Delivering Change Sets and Baselines to a Stream Using the Plain Java Client Libraries provide you with all you need to know to write your own migration tooling.

As always, I hope that sharing this code helps users out there who need to use the APIs to do their work more efficiently.

8 thoughts on “Extracting an Archive Into Jazz SCM Using the Plain Java Client Libraries”

  1. Excellent Article, Ralph.

    Do you have any suggestion to do the opposite way: extract to a zip the contents of a workspace programmatically?

  2. Hi, Ralph. . super, super useful, as usual. One question: is the reason that this is limited to text files because of the delimiter and to keep the demo simple? Or is the content manager limited in some other respect?

    • Ah, nm.. I think I got it using IFileItem.CONTENT_TYPE_UNKNOWN. Thanks again for all the knowledge you share with the community!

      • Andy, I basically create examples from what I experience doing my work. I don’t necessarily cover all use cases always. Thanks for sharing what you found.

  3. I am not able to upload PDF files. Is it a code limitation or an SCM limitation?
    If possible, update the solution to upload PDF files to SCM.

      • It is important to choose the right combination of line delimiter, content type etc…. com.ibm.team.filesystem.common.FileLineDelimiter, com.ibm.team.filesystem.common.IFileContent
