Randy Zwitch
JupyterCon 2018 Wrap-up: MapD Kernel for Jupyter

A couple of weeks ago I had the pleasure of speaking at JupyterCon 2018 in NYC. Thanks to hosts O’Reilly and NumFOCUS, the talk was recorded, so if you missed how MapD is making Python and Jupyter first-class citizens on the MapD platform, here’s your chance to watch it at your leisure.

Talk Highlights

pymapd: All of the heavy lifting for using MapD from Python is done by pymapd, a DB-API 2.0 (PEP 249) compliant package for working with MapD at its lowest level: uploading data into tables, submitting SQL queries, and general database management.

My colleague and VP of Product Management Venkat Krishnamurthy wrote a blog post showing how to get started with pymapd, as well as describing how MapD is using Apache Arrow as part of the GPU Open Analytics Initiative.
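To make "lowest level" concrete, here is a minimal sketch of the connect / cursor / execute / fetch pattern that Python DB-API drivers share. It assumes no MapD server is available, so the standard-library sqlite3 module stands in for pymapd; the `flights` table and its columns are hypothetical, and pymapd's own connection arguments and bulk-loading helpers are not shown. Placeholder syntax (`?` here) also varies by driver.

```python
import sqlite3

# Stand-in connection: sqlite3 exposes the same DB-API 2.0 style interface.
# Against MapD, the connection would come from pymapd instead (assumption:
# a running server and valid credentials, neither of which is shown here).
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Database management: create a table (hypothetical schema).
cur.execute("CREATE TABLE flights (carrier TEXT, dep_delay INTEGER)")

# Uploading data: a parameterized bulk insert.
rows = [("AA", 12), ("AA", 3), ("UA", 45), ("UA", -2), ("DL", 0)]
cur.executemany("INSERT INTO flights VALUES (?, ?)", rows)
con.commit()

# Submitting a SQL query and fetching the results as a list of tuples.
cur.execute(
    "SELECT carrier, AVG(dep_delay) AS avg_delay "
    "FROM flights GROUP BY carrier ORDER BY carrier"
)
result = cur.fetchall()
print(result)

con.close()
```

The point is only the shape of the workflow: everything goes through a connection and a cursor, and SQL is passed around as strings.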

Ibis: Building upon pymapd is Ibis, a pandas-like deferred expression system for working with data stored in databases. MapD support was added in Ibis version 0.14.0, so writing SQL is no longer a requirement for lightning-fast analytics.

See my blog post Scaling Pandas to the Billions With Ibis and MapD to see a full example of the Ibis workflow with MapD.
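If "deferred expression system" sounds abstract, the toy sketch below may help. It is not Ibis and does not use Ibis's API; the `Table` class and its `filter`, `group_mean`, `compile` and `execute` methods are hypothetical names invented for illustration, and sqlite3 again stands in for a MapD backend. The idea it shows is the same, though: method calls only record what you want, and SQL is generated and run once, when you ask for the result.

```python
import sqlite3


class Table:
    """Toy deferred table expression: records operations, runs nothing."""

    def __init__(self, name, filters=(), group=None, agg=None):
        self.name = name
        self.filters = tuple(filters)
        self.group = group
        self.agg = agg

    def filter(self, condition):
        # Return a new expression; the original is left unchanged.
        return Table(self.name, self.filters + (condition,), self.group, self.agg)

    def group_mean(self, key, column):
        # Record a group-by with a mean aggregate; still no query is run.
        return Table(self.name, self.filters, key, column)

    def compile(self):
        # Turn the recorded operations into a single SQL string.
        # Plain string interpolation: fine for a toy, unsafe for untrusted input.
        if self.group:
            select = f"{self.group}, AVG({self.agg}) AS mean_{self.agg}"
        else:
            select = "*"
        sql = f"SELECT {select} FROM {self.name}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        if self.group:
            sql += f" GROUP BY {self.group} ORDER BY {self.group}"
        return sql

    def execute(self, con):
        # Only here does any work reach the database.
        return con.execute(self.compile()).fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE flights (carrier TEXT, dep_delay INTEGER)")
con.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 12), ("AA", 3), ("UA", 45), ("UA", -2)],
)

# Build the expression step by step; nothing has touched the database yet.
expr = Table("flights").filter("dep_delay > 0").group_mean("carrier", "dep_delay")
sql = expr.compile()
rows = expr.execute(con)
print(sql)
print(rows)
```

Because the whole expression is compiled at once, the database does the filtering and aggregation, and only the small result comes back to Python. That is what lets this style of tool work on tables far larger than local memory.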

%%mapd: The first new open-source MapD functionality debuted during this talk was a prototype of a Jupyter magic for submitting queries without having to escape-quote SQL strings or wrap everything in a Python function. As you’ll see in the video, I defined the magic code right in the notebook, but work towards this functionality is happening in the ipymapd GitHub repository.
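For a sense of how little code a cell magic needs, here is a minimal sketch. It is not the ipymapd implementation: the magic name `sqlcell`, the table `t`, and the module-level sqlite3 connection are all assumptions made so the example runs without a MapD server. The function itself is ordinary Python; it only becomes a `%%sqlcell` magic when an IPython shell is present to register it with.

```python
import sqlite3

# Stand-in database with a tiny hypothetical table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])


def sqlcell(line, cell):
    """Run the cell body as one SQL statement and return all rows.

    `line` is the text after the magic name on the first line (unused here);
    `cell` is everything below it, passed in as a raw string.
    """
    return con.execute(cell).fetchall()


# Register the function as a cell magic only when running inside IPython/Jupyter.
try:
    from IPython import get_ipython
    shell = get_ipython()  # returns None outside an interactive IPython session
except ImportError:
    shell = None

if shell is not None:
    shell.register_magic_function(sqlcell, magic_kind="cell", magic_name="sqlcell")

# Outside a notebook, the same function can be called directly.
demo = sqlcell("", "SELECT SUM(x) FROM t WHERE x > 1")
print(demo)
```

In a notebook, a cell whose first line is `%%sqlcell` and whose remaining lines are plain SQL would then run the query directly, with no quoting or wrapper function around it.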

jupyterlab-mapd: The other functionality debuted during the talk was work towards a JupyterLab plugin, allowing JupyterLab to be used as a full analytics IDE. The goal of this exploration is to use the MapD backend rendering engine to render visualizations and return an image to JupyterLab, rather than streaming data to Jupyter and having JavaScript render the graphics. This will allow for larger-scale (and faster!) visualizations than would be possible in a local web browser. Work towards this functionality is in our jupyterlab-mapd GitHub repository.

We want to hear from you!

Internally, we’re all super excited by the open-source tooling that is being created for MapD, and we wouldn’t be anywhere without the help of Quansight doing most of the heavy lifting. But like any open-source project, we can only be as successful as our community beyond the internal core developers. It’s your real-world use cases that help all of us at MapD smooth out the rough edges and make MapD Core the absolute best it can be.

So, what functionality are you looking for? Whether it’s Jupyter-specific functionality, Python, Java, or any other language or toolset, we would love to know what you think. We’re currently running a Community Survey for feedback, but please feel free to stop by our Community Forum with any questions or comments, or open an issue on any of the MapD GitHub repos if you encounter problems.

About the Author

Randy Zwitch is a Senior Developer Advocate at OmniSci, enabling customers and community users alike to utilize OmniSci to its fullest potential. With broad industry experience in Energy, Digital Analytics, Banking, Telecommunications and Media, Randy brings a wealth of knowledge across verticals as well as an in-depth knowledge of open-source tools for analytics.