Joe Conley Tagged json Random thoughts on technology, books, golf, and everything else that interests me http://www.josephpconley.com/name/json Everything is a Function <p>I’ve been working on a platform that would help free us from our <a href="http://www.josephpconley.com/2014/12/08/apps-are-dead.html">dependence on apps</a>. I’m hoping this ends up being a <a href="http://www.joelonsoftware.com/items/2012/01/06.html">horizontal tool</a> with a combinatorial number of use cases (hence the name DataCombinator). The main idea is to turn all digital interactions into simple functions (a la Excel) which can be composed to do cool stuff.</p> <p>Here are the main concepts you need to know to understand how to execute and compose these functions:</p> <ul> <li>HTTP methods/requests and APIs (<a href="http://www.restapitutorial.com/">tutorial</a>)</li> <li>JSON (<a href="http://www.w3schools.com/json/default.asp">tutorial</a>)</li> <li>JSONPath (<a href="http://goessner.net/articles/JsonPath/">tutorial</a>)</li> <li>Handlebars.js for templating (<a href="http://handlebarsjs.com/">tutorial</a>)</li> </ul> <p>That’s it. Composing functions is as easy as writing a sequence of commands (separated by semicolons). Each function returns a valid JSON value, which is passed to the context of the subsequent function. <strike>The "current" JSON value can be accessed using the `_` character, and can also be used in argument strings using the Handlebars.js syntax.</strike></p> <p><strong>UPDATE</strong> As I was going through various examples it became clear to me that I needed both the most recent JSON value and past ones, so I’ve modified the scripting language to use <code class="highlighter-rouge">this</code> as the current JSON value and <code class="highlighter-rouge">_</code> to represent the array of all results in the script (with the 0th entry being the JSON value representing arguments passed to the script). 
I’ve updated the examples below accordingly.</p> <h2 id="examples">Examples</h2> <p>To quote George R. R. Martin, “Words are wind”, so let’s look at some actual code. You can test the examples out and see the full list of available functions <a href="http://www.datacombinator.com/worksheet">here</a>. If you’re confused by any of the following examples, try them in the worksheet, which shows the result of each function call.</p> <h3 id="weather-alerts">Weather Alerts</h3> <p>Here’s an example of a custom weather alert service (one of the <a href="https://ifttt.com/recipes/popular">most popular IFTTT recipes</a>).</p> <script src="https://gist.github.com/josephpconley/adcca77c201b3f5a55b7.js"></script> <p>This script does the following:</p> <ul> <li>Performs a geo lookup using <a href="https://developers.google.com/maps/documentation/geocoding/">Google Maps</a> to get the geocoding details of the address we’re interested in (in this case my office address). <ul> <li>I added a Handlebars.js helper called <code class="highlighter-rouge">urlEncode</code> to facilitate things like URL encoding</li> </ul> </li> <li>Navigates the result to find the latitude and longitude</li> <li>Gets the short-term weather forecast from <a href="https://developer.forecast.io/">Forecast.io</a> (sign up to get your own API key) using the <code class="highlighter-rouge">lat</code> and <code class="highlighter-rouge">long</code> from the previous call (using Handlebars to pass the values)</li> <li>Sends an SMS message to my phone if the temperature is below 40</li> </ul> <p>Pretty straightforward. The beauty of this is we can customize it to our preferred notification channel, and could easily swap out the SMS call for e-mail or Twitter.</p> <h3 id="auto-generate-rss-based-on-website-updates">Auto-generate RSS based on website updates</h3> <p>Despite the demise of Google Reader, I’ve always been a big fan of RSS. 
It tends to be a less noisy channel of information than social media, allowing for less frequent but longform communication. This example checks the AV Club for reviews of new episodes for a given show (I chose game-of-thrones-experts for this example; you’ll have to inspect the URL for your show) and generates an RSS feed.</p> <script src="https://gist.github.com/josephpconley/11a556fbecd20f159eb8.js"></script> <ul> <li>Performs a web request to get the HTML source code of the webpage</li> <li>Performs an XPath query to extract the HTML elements we need (and implicitly converts them to JSON)</li> <li>Uses the resulting JSON to build an RSS XML file from a Handlebars.js template</li> </ul> <p>This has some more advanced concepts like <a href="http://www.w3schools.com/xpath/">XPath</a> (not to mention knowing proper RSS formatting), but the script is still relatively easy to understand.</p> <h3 id="website-monitoring">Website monitoring</h3> <p>As an owner/maintainer of a few websites, I need to know when any of them goes down. There are several free services that will do this for you, but what if we wanted a customized view into our website uptime? Let’s write a script that monitors a <a href="http://www.swingstats.com/">really cool webapp you can use to track your golf scores and handicap, SwingStats</a>.</p> <script src="https://gist.github.com/josephpconley/798a18bc5f789d6747c4.js"></script> <ul> <li>Performs a web request to the site we’re monitoring</li> <li>Saves the response info in a MongoDB database collection</li> <li>Sends an e-mail if the status is not 200 (OK)</li> </ul> <p>This not only handles notifications but also saves website data in a database (MongoDB). We can then build a separate script to access this data and calculate uptime percentage for SLA purposes.</p> <h2 id="next-steps">Next Steps</h2> <p>I hope these examples were interesting. 
DataCombinator will soon have the ability to save scripts, expose them as API endpoints, and schedule script execution at a specific time interval using <a href="http://en.wikipedia.org/wiki/Cron">Cron expressions</a>. The next version will also expose your favorite social network functions, which could lead to things like one massive interleaved activity stream for all of your social media sites. If you’re interested in updates, follow along on the <a href="http://www.josephpconley.com/tags/datacombinator/">DataCombinator tag</a> or <a href="http://www.datacombinator.com">sign up for e-mail updates</a>.</p> Mon, 26 Jan 2015 00:00:00 +0000 http://www.josephpconley.com/2015/01/26/everything-is-a-function.html http://www.josephpconley.com/2015/01/26/everything-is-a-function.html Query JSON/XML/CSV using SQL <p>Ever wish you could use your favorite query language across different data formats? Or get query results in several formats (XML, JSON, and CSV/XLS)? Then check out <a href="http://www.datacombinator.com/query">DataCombinator’s new query engine</a>.</p> <h2 id="data-sources">Data sources</h2> <p>You can copy and paste structured data manually, point to a URL, or connect to a database directly (H2, MongoDB, MySQL, or PostgreSQL). The engine hasn’t been optimized yet to handle large documents or tables, so please be mindful of input size.</p> <h2 id="query-languages">Query languages</h2> <p>The engine supports JSONPath (powered by <a href="http://www.josephpconley.com/2014/04/15/jsonpath-for-play.html">my open-source Play library</a>), XPath and SQL. You can use any of these languages to query data in any of the JSON, XML or CSV formats. Since JSONPath and XPath are fairly similar and straightforward, the more interesting use cases tend to involve SQL.</p> <h3 id="sql">SQL</h3> <p>The FROM clause isn’t necessary, as the query applies to only one “table”, that is, the data being queried. 
For SQL to work against JSON, the JSON must be an array of objects, e.g.</p> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"id"</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="s2">"Joe"</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"id"</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="s2">"Janine"</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> </span></code></pre></div></div> <p>If the objects in the array have nested levels, each object will be flattened, and the keys concatenated with an “_”, e.g.</p> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"id"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Joe"</span><span class="p">,</span><span class="w"> </span><span class="s2">"address"</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"street"</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="s2">"123 Main St."</span><span class="p">,</span><span class="w"> </span><span class="s2">"city"</span><span class="p">:</span><span class="w"> 
</span><span class="s2">"Springfield"</span><span class="p">,</span><span class="w"> </span><span class="s2">"state"</span><span class="p">:</span><span class="w"> </span><span class="s2">"PA"</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> </span></code></pre></div></div> <p>would be flattened to</p> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"id"</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="s2">"Joe"</span><span class="p">,</span><span class="w"> </span><span class="s2">"address_street"</span><span class="p">:</span><span class="s2">"123 Main St."</span><span class="p">,</span><span class="w"> </span><span class="s2">"address_city"</span><span class="p">:</span><span class="s2">"Springfield"</span><span class="p">,</span><span class="w"> </span><span class="s2">"address_state"</span><span class="p">:</span><span class="s2">"PA"</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> </span></code></pre></div></div> <p>Similarly, an XML must be in a “table format” in order to handle a SQL query, e.g.</p> <div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nt">&lt;table</span> <span class="na">class=</span><span class="s">"ui table"</span><span class="nt">&gt;</span> <span class="nt">&lt;row&gt;</span> <span class="nt">&lt;id&gt;</span>1<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;name&gt;</span>Joe<span class="nt">&lt;/name&gt;</span> <span class="nt">&lt;/row&gt;</span> <span class="nt">&lt;row&gt;</span> <span 
class="nt">&lt;id&gt;</span>2<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;name&gt;</span>Janine<span class="nt">&lt;/name&gt;</span> <span class="nt">&lt;/row&gt;</span> <span class="nt">&lt;/table&gt;</span> </code></pre></div></div> <h3 id="supported-sql-functions">Supported SQL functions</h3> <p>The engine supports basic single-table query functionality (no self joins yet) with simple clauses (WHERE, GROUP BY, and ORDER BY) and a few basic aggregation functions (COUNT, MIN, MAX, SUM). I’ll be working to expand upon this, so if you have any requests <a href="http://www.datacombinator.com/contact">let me know</a>.</p> <h2 id="query-results">Query results</h2> <p>The query engine outputs results in JSON and XML, and also as CSV, an HTML table, or Excel if the result can be converted to a table structure.</p> <h2 id="examples">Examples</h2> <p>Here are a few examples where I’ve found the query engine helpful.</p> <h3 id="espn-apis---json">ESPN APIs - JSON</h3> <p>ESPN has released a <a href="http://developer.espn.com/docs">variety of APIs</a> that allow developers to access headlines and basic team statistics. You’ll need to create a free account and register for a key, at which point you’ll have immediate access to the Public APIs.</p> <p>So for example, if I wanted to find out stats on my beloved Philadelphia Phillies, I would enter http://api.espn.com/v1/sports/baseball/mlb/teams?apikey=MY_API_KEY as the URL in DataCombinator. Using the JSON Raw tab, I can see the pretty-printed response, and quickly search for Phillies to find their id of 22. Using this id, I can get the latest news on the Phightins with the URL http://api.espn.com/v1/sports/baseball/mlb/teams/22/news?apikey=MY_API_KEY. I can then use JSONPath to include only the part of the response I want. 
For example, if I just want all the latest headlines associated with the Phillies, I take a quick look at the structure and apply the <code class="highlighter-rouge">$..headline</code> JSONPath query to return an array of headlines:</p> <div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span> <span class="s2">"Mets end 5-game skid, rally past Phils 5-4 in 11"</span><span class="p">,</span> <span class="s2">"Howard, Rollins lead Phillies past slumping Mets"</span><span class="p">,</span> <span class="s2">"Byrd's double lifts Phillies over Mets 3-2 in 11"</span><span class="p">,</span> <span class="s2">"The base: Approach at your own risk"</span><span class="p">,</span> <span class="s2">"Phillies fall to hot-hitting Blue Jays in 20,000th game"</span><span class="p">,</span> <span class="s2">"Adam Lind activated by Blue Jays"</span><span class="p">,</span> <span class="s2">"Mark Buehrle posts MLB-best sixth win as Blue Jays rock Phillies"</span><span class="p">,</span> <span class="s2">"Blue Jays edge Phillies on sac fly in 10th after blowing 5-run lead"</span><span class="p">,</span> <span class="s2">"Happ stifles Phillies, Blue Jays win 3-0"</span><span class="p">,</span> <span class="s2">"Hernandez outduels Gonzalez, Phillies edge Nats"</span> <span class="p">]</span> </code></pre></div></div> <h3 id="weather-data---xml">Weather Data - XML</h3> <p>OpenWeatherMap.org provides a <a href="http://openweathermap.org/API">free weather API</a> which returns data in XML format. For example, if I wanted to get the current weather in my hometown of Springfield, PA, I could use the URL</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>http://api.openweathermap.org/data/2.5/weather?q=Springfield&amp;mode=xml&amp;units=imperial </code></pre></div></div> <p>to get an XML document back. 
I could then query the document using XPath to get just the temperature via <code class="highlighter-rouge">//temperature</code>, which returns:</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;temperature max="71.52" min="71.52" unit="fahrenheit" value="71.52"/&gt; </code></pre></div></div> <h3 id="opendata---csv">OpenData - CSV</h3> <p>Public institutions are starting to embrace open data practices, enabling civic-minded hackers to build useful applications that provide a public service. In this spirit, the city of Philadelphia has made <a href="https://github.com/CityOfPhiladelphia">various data sets</a> available for public consumption. Most of these data sets are in CSV format. We’ll take one such data set, <a href="https://github.com/CityOfPhiladelphia/phl-site-stats">phl-site-stats</a>, and use the raw URL from GitHub to query it (I picked this data set as it’s relatively small).</p> <p>We’ll take a look at the latest month’s stats found at <a href="https://raw.githubusercontent.com/CityOfPhiladelphia/phl-site-stats/master/SiteStats0514.csv">https://raw.githubusercontent.com/CityOfPhiladelphia/phl-site-stats/master/SiteStats0514.csv</a>. Without entering a query, we would get the entire data set in the results. One point to note is that the query engine will try to convert strings to numbers, making it easy to sort and filter on numeric columns. 
If we wanted to view the most popular sites for phila.gov, we would simply enter the query</p> <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">select</span> <span class="o">*</span> <span class="k">order</span> <span class="k">by</span> <span class="n">page_count</span> <span class="k">desc</span> </code></pre></div></div> <p>Or we could get the total number of unique hits for the month of May:</p> <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">select</span> <span class="k">sum</span><span class="p">(</span><span class="n">unique_page_count</span><span class="p">)</span> </code></pre></div></div> <h2 id="next-steps">Next steps</h2> <p>This query engine will be the foundation of DataCombinator’s platform of data collection and composition tools. Our next step is not only to host structured data via API endpoints, but also to combine multiple data sources into one document (which in turn would be hosted as well!). If you’re interested in learning more, <a href="http://www.datacombinator.com">sign up</a> for e-mail updates or <a href="https://www.twitter.com/DataCombinator">follow us on Twitter @DataCombinator</a>.</p> Tue, 13 May 2014 00:00:00 +0000 http://www.josephpconley.com/2014/05/13/datacombinator-query-engine.html http://www.josephpconley.com/2014/05/13/datacombinator-query-engine.html JSONPath Library for Play <p>I’ve been working on a platform that transforms, composes, and serves data. As part of this effort, I’ve developed a <a href="https://github.com/josephpconley/play-jsonpath">library for Play</a> that performs a JSONPath query on a Play JsValue. You can learn about JSONPath by reading <a href="http://goessner.net/articles/JsonPath/">Stefan Goessner’s blog post</a> on the subject.</p> <p>I use <a href="https://github.com/gatling/jsonpath">Gatling’s jsonpath library</a> to parse the JSONPath expression. 
I then fold over the tokens, performing a pattern match on each to construct the appropriate JsValue. This parser supports all queries except those that rely on expressions of the underlying language, like <code class="highlighter-rouge">$..book[(@.length-1)]</code>. However, there’s usually a ready workaround, as you can execute the same query using <code class="highlighter-rouge">$..book[-1:]</code>.</p> <h2 id="example">Example</h2> <p>Here’s a Scala worksheet which traces the examples in Stefan’s post:</p> <script src="https://gist.github.com/josephpconley/10647739.js"></script> <h2 id="deviation-from-jsonpath">Deviation from JSONPath</h2> <p>One conscious deviation from JSONPath is that I always flatten the results of a recursive query. Using the bookstore example, typically a query of <code class="highlighter-rouge">$..book</code> will return an array with one element, the array of books. If there were another book array somewhere in the document, then <code class="highlighter-rouge">$..book</code> would return an array with two elements, both arrays of books. However, if you were to query <code class="highlighter-rouge">$..book[2]</code> for our example, you would get the third book in the first array (indices are zero-based), which assumes that the <code class="highlighter-rouge">$..book</code> result has been flattened. In order to make recursion easier and the code simpler, I always flatten the result of recursive queries regardless of the context.</p> <p>If you have any questions, comments, or suggestions please let me know. I hope to be introducing an early iteration of my data platform shortly so stay tuned!</p> Tue, 15 Apr 2014 00:00:00 +0000 http://www.josephpconley.com/2014/04/15/jsonpath-for-play.html http://www.josephpconley.com/2014/04/15/jsonpath-for-play.html
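<p>The flattening of recursive queries described in the JSONPath Library for Play post can be illustrated with a minimal sketch. This is not the library’s Scala code; it is a plain-Python illustration, and the names <code class="highlighter-rouge">recursive_descent</code>, <code class="highlighter-rouge">flatten_one_level</code>, and the sample <code class="highlighter-rouge">store</code> document (including its <code class="highlighter-rouge">annex</code> key) are hypothetical. It assumes flattening means splicing each array found by the recursive step into a single result array.</p>

```python
from typing import Any, List


def recursive_descent(node: Any, key: str) -> List[Any]:
    """Collect every value stored under `key` anywhere in `node`, in document order."""
    found: List[Any] = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            # Keep descending so nested occurrences are found as well.
            found.extend(recursive_descent(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(recursive_descent(item, key))
    return found


def flatten_one_level(values: List[Any]) -> List[Any]:
    """Splice array results into the output instead of nesting them."""
    flat: List[Any] = []
    for v in values:
        if isinstance(v, list):
            flat.extend(v)
        else:
            flat.append(v)
    return flat


# Hypothetical document with two separate "book" arrays.
store = {
    "store": {
        "book": [{"title": "A"}, {"title": "B"}, {"title": "C"}],
        "annex": {"book": [{"title": "D"}]},
    }
}

unflattened = recursive_descent(store, "book")  # two elements, each an array of books
flattened = flatten_one_level(unflattened)      # one array holding all four books
```

<p>With the flattened result, an index applied after the recursive step (as in <code class="highlighter-rouge">$..book[2]</code>) addresses books directly rather than whole arrays, which is the behavior the post relies on.</p>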