EWM REST API to Access Existing Work Item Queries


After looking into the EWM OSLC CM API, including the OSLC Query API, a natural question that comes up is: Is it possible to take advantage of the existing EWM work item queries? Is it possible to use that query mechanism and run existing work item queries to get the result set back?

I had to look into that for a customer and an internal project and where experiencing some difficulties. So I think it is valuable to share my experiences and examples that finally worked for me.

Context of the blog post is the series

This is the series of planned posts I intent to publish over time. Most of the examples will be EWM based, but quite a lot of the content applies to more ELM applications.

External Links

I used at least the following links for exploring this mechanism.

EWM Stored Query API Introduction

EWM provides its own REST API that supports finding and running predefined, shared or personal work item queries that are stored in the repository. The API allows to query for all the stored work item queries available. The API allows to run such a query and to access the result information to present the work items and their data. This API is EWM only as far as I can tell. It is not documented to be available for any of the other products. It is also important to note, this API is not an OSLC API, even if the query URL contains oslc as part of the path.

The examples for how to use built in query are nice to have, but I found the documentation was lacking essential information. Maybe it is jut me and it comes naturally to all the native cloud wizz kids. I struggled to get the examples to work. One pattern I have now seen several times in documentation and examples is a lack of information about required and supported headers and how to URL encode query parameters. I will try to explain what worked for me. The documentation has recently been updated and now contains more details compared to when I first explored it.

For this specific part of the API, my observation was that getting the query parameters wrong, often resulted in the query returning results, but not the expected results. Usually the result set way way bigger than expected in those cases.

One important behavior that is very sensitive to the Accept header is how the information is returned and presented in the responses. There are two completely different pattern.

  • Only the URLs to the results are available in the response. To get any details about the result referenced by the URL, it is necessary to query that URL. This corresponds to the situation for OSLC queries without or an empty oslc.select statement.
  • Additional information is provided inline in the response. It is not necessary to query the result URI to be able to access the most relevant information. This corresponds to the situation for OSLC queries with an oslc.select=* statement.

When I looked at the API first, I missed information about the valid headers. As far as I can tell, the valid headers are now updated in the wiki page. The supported and working headers are mentioned below as well.

Headers and performance

Which header to use seems to be a quite unimportant difference, but it can have a huge impact on the communication performance. Lets create a Gedankenexperiment. Lets assume we perform a request to the server. We want to search in the response data to get the URI to a query with specific property values like a project area, query name and creator. Lets assume the following characteristics for the request:

  • Result set size is 130 items
  • Paging size is 50 items

To process the response with the properties inline it is necessary to run 3 requests at most. If the desired information is on page 1, only one request would be necessary. Worst case all related data needs to be sent and received.

To process the data without the properties inline it is necessary to load a page. Then it is necessary to run another request for each of the URLs in the result set page and process the received data. In worst case this requires 3 + 130 = 133 requests. Worst case all related data needs to be sent and received as well.

Why is that a problem? Worst case, if the desired data is in the last link processed, both methods have required to transfer the same amount of data. In many cases one could argue the 1st approach would even require more data to be transferred on average than in the 2nd approach.

However, in my experience the information transfer between two network nodes is significantly slower than the time that would be required in a server to get the desired information. Each message that needs to be sent causes a considerable amount of overhead and delay. It is usually cheaper to send fewer messages than many small ones that transfer a similar amount of data, just because of the overhead required for each message. In addition all the subsequent requests for the details cause activity on the server which is likely to cause a lot more overhead compared to collecting all the information for the 3 initial request(s).

As an real world example, when I started to look into this, I did not know which headers where available. I used the headers I had often used e.g. ‘application/rdf+xml’. In addition, the query that I created had an issue. Instead of only returning the queries for a specific user in a specific project area, it returned all queries for the whole server. The responses only contained the query URLs. To get the name and creator of the query each URL had to be requested. It was also not obvious why the result set was so large and I had no code to recognize that results were related to project areas other than the expected. The server I ran against is on another continent and I was in my home office. I stopped the experiment after an hour or two.

At the end I was able to create solution that performs well and reduces overhead to a limit. But it was no means a simple process. I reached out to someone for hints. As a customer I would not have had the opportunity. This is why I try to finally share this information.

Stored Query API URL

I was not able to find a way to discover the URI to get the work item queries. The Wiki page mentions to find the simpleQuery tag in the project areas service provider, but I seem to be unable to find it, so I need to construct it based on the wiki page. From the Resource Oriented API page I can deduce two possible forms for the URL. Assuming PublicURI being the public URI of the CCM server e.g. ‘https://elm.example.com:9443/ccm’ the URL for the Stored Query Collection is either

PublicURI + '/oslc/queries'

or

PublicURI + '/oslc/queries.xml'

Both seem to work. Choose one you like and that works for you.

Query Stored Queries

The request below gets all the stored queries for a repository.

Get all stored queries for the repository

Note that the only Accept headers documented to be valid at the moment are:

Accept text/xml
Accept text/json

It is necessary to be authenticated to the ccm server to be able to perform the operation.

The image below shows a part of the response to the query above.

The response to the query for all stored queries

The first section shows the query for the next page in the attribute oslc_cm:next and the total result count in oslc_cm:toatlCount.

The rest of the information is one element rtc_cm:Query for each query descriptor. The element contains the query URI as rdf:resource, the identifier for the query, the Query name/title and description and the project area the query belongs to. The most important information is the rtc_cm:results element with the URL to execute the query. As an example the URL below:

https://elm.example.com:9443/ccm/oslc/queries/_87dqoFpmEeukW7cqqDjAuA/rtc_cm:results

Additional information is the date of the last modification and the user that created the query.

Note that the result is either XML or JSON. The tools for RDF that were used in some of the previous posts using the Python rdflib do not work in this context. Instead the Python code shown in the blog uses xml.dom.minidom to access the information in case of XML. The JSON code examples shown in previous posts using the json library can be used as a reference. This post will mostly use the Accept header text/xml. The code depends on the following libraries.

The main Libraries used to process the stored queries API.

Some example code for processing the data in Python will be shown below.

Narrow Down Stored Queries

It is usually not desirable to query for all stored queries for all project areas. A real production server can have hundreds of project areas, team areas and users that can have predefined or shared or personal queries defined. The result could be thousands to ten thousands of queries. To collect and transfer all that data is a challenge and will drain server resources as well.

It is possible to filter for all the properties that make up a query, identifier, name, creator, owning project area. The code below computes the query parameters from the parameters passed and creates a query section for the query URL.

Create the query parameters to narrow down the query results.

As an example the query that is created could look like below. The condition can narrow down to the project area, the user and the title, dependent on how much information is provided. If enough information is given, this would usually only return one hit. Note that this is a string terminated by the singe quote and the double quote characters inside are important. The query narrows down to one project area, the query name ‘All’ and the creating user.

'rtc_cm:projectArea="_8e5qfFpmEeukW7cqqDjAuA" and dc:title="All" and dc:creator="https://elm.example.com:9443/jts/users/ralph"'

How to discover the project area UUID has been explained in previous posts e.g. about discovery. The code that was used here is the same used in the other posts.

The code blow shows how the query URL is composed.

Python code to compose the query parameters and to create the query URL

The parameters provided are the same as above, the project area, the query name and the query creator. The creator can be provided in different ways. It is possible to provide a user URI like

'https://elm.example.com:9443/jts/users/ralph'

It is also possible to provide the variable string below.

'{currentUser}'

The query mechanism replaces this variable with the URI of the user that runs the request. For example, the query below would return the query named All created by the current user.

'rtc_cm:projectArea="_8e5qfFpmEeukW7cqqDjAuA" and dc:title="All" and dc:creator="{currentUser}"'

The last three lines in the image above compose the query URL. First the base query URL is composed. This is the part of the query that returns every stored query. Then the query parameter section is composed by adding ‘?oslc_cm.query=’ and then adding the URL encoded query parameters. The resulting query URL looks like below.

'https://elm.example.com:9443/ccm/oslc/queries.xml?oslc_cm.query=rtc_cm%3AprojectArea%3D%22_8e5qfFpmEeukW7cqqDjAuA%22+and+dc%3Atitle%3D%22All%22+and+dc%3Acreator%3D%22%7BcurrentUser%7D%22'

Execute the Query for Stored Queries

The code below executes the query to search for stored queries.

The first line runs the query that was just created using the function findQueryResources().

Execute the query to find stored queries

This function is shown in the image below. It uses the Accept header ‘text/xml’. While there is a valid query URL for a page, it executes that query.

The code that executes the query for the stored queries and collects the results

It calls qeryQueryInfoXML() to process the page result data. This code is shown below. The function builds up two arrays, one containing the stored query URL and one the associated query name. It also gets the query URL for the next page, if there is any, and returns that.

Get the information about the queries from the response

All query pages are iterated and the results added to the respective arrays.

Evaluate the result and

At the end of executing findQueryResources, there is a list of query names and an associated list of query URLs. Dependent on the query parameters given, it could return many or no result. The code here as shown below focuses on using the query name, but it would be easy enough to extract more information per stored query to use different criteria to look into the result set. The code tries to find the first stored query returned with the given name.

Find the query to run and execute it

If a stored query has been found, that query is finally executed in the function execute_query_xml(). The code for this function is shown below. Again, this uses the Accept header ‘test/xml’ and the results are processed as XML.

Process each result page of the stored query

The code iterates the query pages and is similar to the code already shown above. The code passes each query result page to the function processQueryResultInlineXML(). The code is shown below. It also looks very similar to the code we have already seen. The difference is that the code below now works on work item resource information and not on query information.

Processing one page of the stored query with work item data inline.

The code analyzes the oslc_cm:Collection and gets the total result count and the next page (if there is one). Then it analyzes the inline XML data for all the work items. For each work item it gets some of the available information and prints that. It returns the work item URIs, the totals and the next page etc.

Executing Stored Query Using JSON

The code here shows a JSON based implementation of the functions to execute the stored query after finding it. The function execute_query_json() does the same the function execute_query_xml() does and looks very similar. It is only using JSON as format to parse the data. The Accept header used below is ‘text/json’.

Process all query pages in JSON

The function getQueryResultDetailsInlineJson() is called to process each page. Like its XML twin it gets the total count and the next page (if available). Then it gets details about the embedded inline work item data and logs it. Like the XML based code, the function builds an array with the work item URIs for the page. It also returns additional information such as the link to the next page.

Process one query page with JSON work item data inline

More information

Please check the section Getting Resource Representations to understand additional ways to get work item data representations. Also check the section Getting Partial Representations to understand options to limit the data that needs to be transferred.

CSV Export

The query descriptor URL provided in the search by the tag rtc_cm:Query, can also be used in a browser to get the query displayed. As an example see the information below.

<rtc_cm:Query rdf:resource="https://elm.example.com:9443/ccm/resource/itemOid/com.ibm.team.workitem.query.QueryDescriptor/_pdT5AATBEeyiDIlAjMF45Q">

Opening the URL in the rdf:resource results in the following display in the browser.

The query descriptor URL is opened in the browser and executes the query.

It is also possible to use that URL to download the query result as CSV file. By adding the query parameter media type e.g.

?_mediaType=text/csv

to the query descriptor URL, the query can be changed to export the data as CSV file. As an example the URL below exports the query result to CSV.

https://elm.example.com:9443/ccm/resource/itemOid/com.ibm.team.workitem.query.QueryDescriptor/_pdT5AATBEeyiDIlAjMF45Q?_mediaType=text/csv

By pasting the URL in the browser, the CSV download is triggered. It is also possible to use the URL above in tools like cURL.

Summary

This blog post has described the steps that are necessary to find existing work item queries and how to run such queries to get the data. I hope that the examples here in my blog help users out there with their work.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.