This is Data Insecurity, a blog series about all the ways organizations fail to keep data secure online.
What is data security? To me, data security means ensuring that only authorized users can access certain content, both when they use your website and when they access it programmatically. Websites that secure their data with usernames and passwords need to enforce the same level of security against programmatic access; otherwise, their security can be easily compromised.
I recently wanted to start volunteering with a local organization. I looked them up online and found that they maintain a calendar of shifts where they need volunteers, but you have to create an account to access the calendar. This log-in suggests that they either use it to manage sign-ups for different shifts or want certain information to be accessible only to authorized users. Or both.
The organization used a website called volgistics.com to manage their authentication and calendar. The Volgistics website advertises that “Volgistics uses the same level of security as online banking, and a secure data center to keep volunteer information safe.” I wanted to kick the tires on this claim and see how they were securing the organization and user data. To be fair, the website only claims to keep “volunteer” information safe, which I assume means personal information about individual volunteers, rather than non-profit organizational information (contact information, event dates, times and locations). I imagine some organizations assume that the login on the Volgistics website also allows them to post certain organizational information that they do not want broadcast to the whole world online. If this assumption is correct, then Volgistics should secure organizational information just as much as it secures volunteer information.
How does the website secure data?
We can open Chrome Developer Tools to see how the website exchanges information with its servers. Specifically, we want to navigate to the Network tab, which shows the traffic passing between the frontend (your computer) and the backend (the website’s servers). Each piece of communication is called a “request” because your computer is requesting a piece of information from the website’s servers. The website can either provide the requested information or return an error (you might be aware of 404 or 500 errors); how the website responds to a request is called a “response”. A request includes everything needed to ask the backend for information, including a full URL, a payload, and optional headers.
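As a small illustration of the status codes mentioned above, here is a sketch that uses Python's standard `http.HTTPStatus` to name a code and say whether it signals success, a client error, or a server error. The helper name `describe_status` is my own and is only for illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {status.phrase} ({kind})"


for code in (200, 404, 500):
    print(describe_status(code))
```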
With the Network tab open, we log into the website. We see a Request URL indicating a login: https://www.volgistics.com/api/vicnet/auth/log-in?platform=web. The request is a POST request, meaning the request included a payload of information needed to tell the server what information we want back. In this situation, the payload looks like this:
{"email":"EMAIL","password":"PASSWORD","FROM":ACCOUNT_NUMBER}
The website will use this payload to try to log me in. The Preview sub-tab shows the server response. In this specific request, the Preview is a dictionary containing three keys: jwt, organizations, and vServer. jwt stands for JSON Web Token and is a common form of authentication in modern web apps. After logging in, the JWT serves as authentication in every subsequent request and is most commonly provided in the Authorization header. In the Headers sub-tab, we can see the JWT is passed as the value in the authorization header and is prefixed by Bearer (which indicates that the user making the request is bearing or holding the token to gain access to certain information on the server).
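To make the login flow concrete, here is a minimal sketch in Python that builds (but deliberately does not send) the POST request and then turns a login response into a Bearer header. The URL and payload keys come from the Network tab as described above; the `Content-Type` header, the helper names, the account number, and the placeholder response values are my own assumptions for illustration.

```python
import json
import urllib.request

LOGIN_URL = "https://www.volgistics.com/api/vicnet/auth/log-in?platform=web"


def build_login_request(email, password, account_number):
    """Build the login POST request without sending it."""
    payload = {"email": email, "password": password, "FROM": account_number}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        method="POST",
        # Assumption: the API expects a JSON body.
        headers={"Content-Type": "application/json"},
    )


def bearer_header(login_response):
    """Turn the `jwt` key of a login response into an Authorization header."""
    return {"Authorization": "Bearer " + login_response["jwt"]}


# Placeholder response: only the three key names are taken from the real site.
example_response = {"jwt": "aaa.bbb.ccc", "organizations": [], "vServer": ""}

request = build_login_request("EMAIL", "PASSWORD", 12345)
print(request.get_method(), request.full_url)
print(bearer_header(example_response))
```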
In addition to the JWT, we also see the x-api-key request header, which is not a standard HTTP header. The presence of this header suggests that the value is important for accessing server data. I have been unable to determine which response sent the API key from the server to the client so the client could send it back in subsequent requests.
To confirm this authorization scheme, I attempted to request the same data using Postman, and I only received a successful response when I provided both a JWT and the x-api-key header. I’ll call these two items the “authentication tokens” because they are required for accessing certain data.
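The Postman experiment can be sketched in Python as a helper that refuses to build a request unless both authentication tokens are present. The function name and the `example.com` URL are hypothetical; only the two header names come from what I observed in the Headers sub-tab.

```python
import urllib.request


def build_authenticated_request(url, jwt, api_key):
    """Build a GET request carrying both authentication tokens."""
    if not jwt or not api_key:
        raise ValueError("both the JWT and the x-api-key value are required")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {jwt}",
            "x-api-key": api_key,
        },
    )


# Hypothetical endpoint and placeholder token values.
req = build_authenticated_request(
    "https://example.com/api/profile", "aaa.bbb.ccc", "PLACEHOLDER_KEY"
)
# urllib normalizes the capitalization of header names; HTTP header names
# are case-insensitive, so this does not change the request's meaning.
print(req.header_items())
```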
What data can we not access and why?
The website requires the JWT and API key on all requests for volunteer data. The network request for the volunteer profile page does not include the volunteer account ID in the request payload. This raises the question: how does the website know which volunteer profile to load? The answer lies in the authentication tokens themselves: either the JWT carries claims that identify the user (a common design), or the server keeps a record linking the JWT and/or API key to the user’s account. This seems like a strong security measure because there is no easy way to obtain another user’s tokens: neither value can practically be guessed, and a JWT is cryptographically signed, so it cannot be forged without the server’s signing key.
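One common way a token can identify its user is through claims in the JWT payload, which is just base64url-encoded JSON. The sketch below decodes the payload of a made-up token; I have not confirmed what claims the Volgistics tokens actually carry, so the `sub` value and the token itself are purely illustrative.

```python
import base64
import json


def b64url_encode(data):
    """Base64url-encode bytes without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_jwt_payload(token):
    """Decode the payload (middle segment) of a JWT.

    This does NOT verify the signature; it only reads the claims.
    """
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Build a made-up token so the example is self-contained.
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "volunteer-123"}).encode())
token = f"{header}.{payload}.not-a-real-signature"

print(decode_jwt_payload(token))
```

Anyone holding a token can read its claims this way, but changing them would invalidate the signature, which is what the server checks.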
What data can we access without the authentication tokens?
After logging in, I navigated around the website to see the network traffic for the different pages. Specifically, I wanted to determine where and how the website loads information about individual volunteers (such as a volunteer profile or settings page) and information about the organization seeking volunteers (the who, what, where, and when of actual volunteer events).
I found some requests that did not include the authentication tokens once I navigated to the organization’s volunteer calendar page. These requests returned information about the volunteer organization and its volunteer events. The request URLs were not under the website’s api/ URL path; instead, they were on dll URLs, suggesting that a Microsoft .NET backend responds to these requests. I could not find an easy way to identify all the API endpoints, but navigating the volunteer opportunities revealed two exposed endpoints: getAccount and getProfile. The first endpoint returns the organization’s account information (including name, logo, and other metadata) along with a collection of volunteer opportunities, called “listings”. The second endpoint returns the volunteer event information, including the date, time, location, qualifications, duties, and directions.
How could they secure the insecure data?
To systematically scrape account numbers and events, I relied on a basic sequential numbering system, where the first account is number 1, the second is 2, etc. In database terminology, this numbering system is known as an “auto increment” because the numbering system will automatically increase (increment) by one with each new item. Not every account number in the numbering system returned an actual account; it seems the numbering system skips some numbers (or possibly the account numbers are for accounts that have since been deleted). Despite this, I was able to simply iterate through a list of account numbers in ascending order until I concluded that I had reached the highest number.
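The iteration described above can be sketched against an in-memory stand-in rather than the real site. The stopping rule (give up after a run of consecutive missing numbers) is an assumption of mine for illustration, as are the function name and the fake data.

```python
def enumerate_accounts(fetch_account, start=1, max_consecutive_misses=50):
    """Walk sequential account numbers until a long run of misses.

    `fetch_account` is any callable that returns account data for a
    number, or None when no account exists at that number.
    """
    found = {}
    misses = 0
    number = start
    while misses < max_consecutive_misses:
        account = fetch_account(number)
        if account is None:
            misses += 1  # gap: skipped or deleted account number
        else:
            found[number] = account
            misses = 0  # reset the run of misses after a hit
        number += 1
    return found


# Fake data standing in for the server; note the gaps at 3, 5 and 6.
fake_accounts = {1: "Org A", 2: "Org B", 4: "Org C", 7: "Org D"}
result = enumerate_accounts(fake_accounts.get, max_consecutive_misses=5)
print(sorted(result))
```

The point is that nothing about the identifiers has to be discovered: knowing one valid number is enough to guess its neighbors.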
An auto increment is a straightforward means to ensure each item has a unique number but it is highly predictable. To secure this data, I recommend switching from an auto increment to a random Universally Unique Identifier (“UUID”) because UUIDs are unpredictable by their very nature. For more information comparing Auto Increments and UUIDs, please see this article.
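The difference in predictability is easy to see with Python's standard `uuid` module. This is only a sketch of the idea; the variable names are mine, and I am not suggesting this is how Volgistics generates identifiers.

```python
import uuid

# Auto increment: the next identifier follows directly from the last one.
sequential_ids = [1, 2, 3]
next_guess = sequential_ids[-1] + 1
print("next sequential id is predictable:", next_guess)

# Random (version 4) UUIDs: 122 random bits each, so knowing some
# identifiers gives no practical way to guess another.
random_ids = [uuid.uuid4() for _ in range(3)]
for identifier in random_ids:
    print(identifier)
```

Unpredictable identifiers make enumeration impractical, though requiring the authentication tokens on these endpoints would address the problem more directly.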