Creating a REST API: Manual Pagination, Sorting, and Filtering

By August 14, 2018 Uncategorized

At this point in the series, the REST API supports basic CRUD capabilities on the employees endpoint. However, clients often need some control over how multiple records are fetched from the database. In this post, you’ll make the API more flexible by adding pagination, sorting, and filtering capabilities.

Overview

Currently, when an HTTP GET request is issued on the employees route, all of the table’s rows are returned. This may not be a big deal with only 107 rows in the HR.EMPLOYEES table, but imagine what would happen if the table contained thousands or millions of rows. Clients such as mobile and web apps generally consume and display only a fraction of the rows available in the database and then fetch more rows when needed – perhaps when a user scrolls down or clicks the “next” button on some pagination control in the UI.

To allow for this, REST APIs need to support a means of paginating the results returned. Once pagination is supported, sorting capabilities become important as data usually needs to be sorted prior to pagination being applied. Additionally, a means of filtering data is very important for performance. Why send data from the database, through the mid-tier, and all the way to the client if it’s not needed?

I will use URL query string parameters to allow clients to specify how results should be paginated, sorted, and filtered. As is always the case in programming, the implementation could vary depending on your requirements, performance goals, etc. In this post, I’ll walk you through a manual approach to adding these features to an API. This approach provides very granular control but it can be laborious and repetitive, so I’ll show you how a module can be used to simplify these operations in a future post.

Pagination

The query string parameters I will use for pagination are skip and limit. The skip parameter will be used to skip past the number of rows specified while limit will limit the number of rows returned. I’ll use a default of 30 for limit if a value isn’t provided by the client.

Start by updating the controller logic to extract the values from the query string and pass them along to the database API. Open the controllers/employees.js file and add the following lines of code in the get function, after the line that parses out the req.params.id parameter.

Now the database logic needs to be updated to take these values into account and update the SQL query accordingly. In SQL, the offset clause is used to skip rows and the fetch clause is used to limit the number of rows returned from a query. As usual, the values will not be appended directly to the query – they will be added as bind variables instead for performance and security reasons. Open db_apis/employees.js and add the following code after the if block in the find function that appends the where clause to the query.

That’s all you need to do for pagination! Start the API and then run a few cURL commands in another terminal to test it. Here are a few examples you can use:

With pagination now working, you may already see the importance of being able to sort the data before pagination is applied. You will add sorting in the next section.

Sorting

At a minimum, clients should be able to specify the column to sort by and the order (ascending or descending). The simplest way to do this is to define a query parameter (I’ll use sort) that allows a string like ‘last_name:asc’ or ‘salary:desc’ to be passed in. Of course, you could take this further, perhaps allowing clients to sort by multiple columns, control how nulls are treated, etc. I’ll keep things simple and only allow clients to specify a single column and direction as above.

In SQL, the order by clause is used to sort data. Unfortunately, it is not possible to bind in the column name in the order by clause of a SQL query as it’s considered an identifier rather than a value. This means you’ll need to be very careful when appending the column name and direction to the query to prevent SQL injection. You could sanitize the values passed in or compare them against a whitelist of values. I’ll use the whitelist approach as it provides more control than generic sanitization.

One last thing before we get to the code… The only way to guarantee the order of a result set returned from a SQL query is to include an order by clause. For this reason, it’s a good idea to have a default order by clause defined to ensure consistency when the client doesn’t specify one.

Return to the controllers/employees.js file and add the following line of code in the get function, after the line that parses out the req.query.limit parameter.

Next, open db_apis/employees.js and add the following line below the lines that declare and initalize baseQuery.

sortableColumns is the whitelist of columns that clients will be able use for sorting. Next, inside the find function, add the following if block which appends the order by clause. This needs to be done after the where clause is added, but before the offset and fetch clauses.

The first part of the if block checks to see if the client passed in a sort value. If not, a default order by clause that sorts by last_name in ascending order is appended to the SQL query. If a sort value is specified, then it’s first broken up into the column and order values and each value is validated before the order by clause is appended to the query.

Now you can restart the API and run some cURL commands to test it. Here are some examples to try out:

The last two examples should throw exceptions because they contain values that were not whitelisted. Currently, Express’ default error handler is being used, which is why the error is returned as an HTML web page. I’ll show you how to implement custom error handling in a future post.

Filtering

The ability to filter data is an important feature that all REST APIs should provide. As was the case with sorting, the implementation can be simple or complex depending on what you want to support. The easiest approach is to add support for equals filters (e.g. last_name=Doe). More complex implementations may add support for basic operators (e.g. <, >, instr, etc.) and complex boolean operators (e.g. and & or) that can group multiple filters together.

In this post, I’ll keep things simple and only add support for equals filters on two columns: department_id and manager_id. For each column, I’ll allow for a corresponding parameter in the query string. The database logic that appends a where clause when GET requests are issued on the single-employee endpoint will need to be updated to allow for these new filters.

Open controllers/employees.js and add the following lines below the line that parses out the value of req.query.sort in the get function.

Next, edit db_apis/employees.js by adding a 1 = 1 where clause to the baseQuery as seen below.

Of course, 1 = 1 will always resolve to true so the optimizer will just ignore it. However, this technique will simplify adding additional predicates later on.

In the find function, replace the if block that appends the where clause when a context.id is passed in with the following lines.

As you can see, each if block simply adds the value passed in to the binds object and then appends a corresponding predicate to the where clause.

Save your changes and then restart the API. Then use these cURL commands to test it:

And there you have it – the API now supports pagination, sorting, and filtering! The manual approach provides a lot of control but requires a lot of code. The find function is now 58 lines long and it only supports limited sorting and filtering capabilities. Of course, there are ways to make this easier, though you may have to sacrifice some control.

This post is the last “core” post in the series on building a REST API with Node.js and Oracle Database. However, I will extend this series with a number of follow-up posts that cover a variety of REST API and Oracle Database features. The first such follow-up post will show you how to use a module to simplify and standardize pagination, sorting, and filtering. Stay tuned!

2 Comments

  • It is possible to bind column names for ORDER BYs if you use a CASE statement:

    sql = select first_name, last_name, department_id, salary
    from employees
    order by
    case :ob1
    when 'FIRST_NAME' then first_name
    else last_name
    end,
    case :ob2
    when 'DEPARTMENT_ID' then department_id
    else salary
    end
    ;

    Note this ends up using a ‘whitelist’ approach, as shown in the rest of the article. This is important for security.

    Whether you want to bind like this, or not, is another matter. Checking the SQL optimizer plan before making a decision is wise.

    Either way, the code in the article results in lots of different variants of SQL statements potentially being executed. Users should check the node-oracledb statement cache size is tuned appropriately, see https://oracle.github.io/node-oracledb/doc/api.html#stmtcache

    • danmcghan says:

      Yeah, this gets really complex once you add support for more columns, controlling the order, and multi-column sorts. 🙂

      I agree that this code could result in many different SQL statements. But since this is the manual approach, the author could make edits to restrict the variants as needed. In other words, it’s easy to optimize because there’s no black-boxy magic going on.

Leave a Reply