The convergence of IoT and big data promises tremendous new business value and opportunities for enterprises across all major industries. Data science and IoT analytics are critical to unlocking the value of all this newly created IoT data. As IoT big data scales exponentially, enterprises look to GPU-accelerated IoT analytics to handle the torrents of data and bring zero-latency exploration to IoT data that has traditionally been stored and left in the dark.

The Convergence of IoT and Big Data

Over the past decade, IoT has grown from niche to necessity. The interconnectivity between conventional computers and mobile devices sparks curiosity about non-internet-enabled physical devices and everyday objects, from automobiles to toaster ovens. Emergent fields such as Industrial Internet of Things (IIoT) and Internet of Wearable Things (IoWT) demonstrate how IoT has become an intrinsic aspect of our economy, from manufacturing to consumption.

Concurrent with the explosive growth of IoT is big data. As enterprises wrangle with new technologies aimed at capturing the volume, velocity, variety, and veracity of their data, they can also discover new value by using IoT data analytics. Business Intelligence (BI) and data visualization solutions continue delivering new insights and new ways of solving old problems. As enterprises capture greater value from their ever-growing data warehouses and lakes, they recognize IoT as the next great frontier in big data analytics.

IoT and big data are now inseparable. The convergence of these two technologies compounds their business value and opportunity for new business use cases that drive innovation across every industry. New sensors, instruments, and an array of other connected devices now continuously stream invaluable IoT data about the world around us. By 2020, an estimated 4.4 trillion GB of data will be generated per year. And the value of this data will only continue to grow. The challenge now is IoT data analytics and visualization at scale.

 

NVIDIA Showcases the Power of OmniSci at GTC
We want to accelerate data scientists' work by giving them the instrument of their science, so they can accomplish their life's work as quickly as possible.
- Jensen Huang, CEO, NVIDIA

Latency Kills IoT Data Analytics

With the exponential growth of data streams comes a new IoT data challenge: analyzing and visualizing that data at scale. Popular spreadsheet tools max out at 100,000 rows or fewer. Mainstream IoT analytics tools are usually capable of greater scale, but their query times leave much to be desired. Data scientists and analysts can wait five minutes, or even five hours, for a query to return. This is because the scale of data, owing largely to the unexpected growth of IoT, has dramatically outpaced the growth of processing power. Moore’s law is no longer the law of the land.

This is critically important in data science and IoT, because data scientists and analysts become reluctant to run additional queries when each one means a long wait. Their data exploration is interrupted, their thought processes impeded by another trip to the coffee machine while another query slowly spins to life. It’s frustrating and completely antithetical to the analytical process. Hypotheses go untested. Data goes unexplored.

As a result of these wait times, data engineers have come up with ways of circumventing the limits of CPU processing. They take averages of the data, or they sample small percentages of datasets to extrapolate results. But this is completely inappropriate to the world of IoT data, where there is great value to be had in single location and time events, events that would otherwise be washed out by bad data science.

As the limits of CPUs become more evident, a growing number of forward-thinking data engineers and data scientists look to graphics processing units, or GPUs, to provide incredible acceleration at the incredible scale of IoT big data.

The Shift from CPU to GPU

GPU acceleration is a fundamental shift in the world of data science and analytics. In this world, mainstream CPU-based analytics tools still reign supreme, but not for much longer. These tools consist of the common BI and data visualization solutions, as well as analytics tools for Geographic Information Systems (GIS). These are feature-rich tools, primarily designed to provide self-service reporting dashboards, drill-down, and visualization capabilities to large numbers of workers.

CPU-based analytics tools typically rely on underlying processing technologies and require complex, expensive system architectures and data pipelines to support them. Even then, these CPU-based analytics tools are slow, especially for IoT big data analysis. Data scientists and analysts are accustomed to long query times, from five minutes to five hours or more. As IoT big data scales exponentially, these query times become longer, and the CPU hardware footprint grows larger, more complex, and cost-prohibitive. These hindrances are why enterprises now look to GPU-accelerated analytics.

GPU acceleration can return queries as much as 1000x faster than CPU-based tools, at a fraction of the hardware footprint. This is because GPUs are designed to quickly render high-resolution images and video through parallel operations on multiple sets of data. GPUs now drive GPU databases, which we cover in much greater detail on our Introduction to GPU databases page.

 

GPU-accelerated Analytics for VAST IoT Data

With the shift from CPU-based to GPU-based analytics solutions, enterprises are unlocking new business use cases around data that was once simply too big or streaming too fast to analyze. With the inherent spatiotemporal (location and time) component of IoT data, as well as the ability of GPUs to plot billions of points and render complex polygons, the business value of GPU-accelerated analytics for IoT use cases grows exponentially.


IoT business analytics use cases can be characterized by the acronym VAST: Volume & Velocity, Agility, and Spatiotemporal.

Volume & Velocity

IoT data has tremendous volume and velocity. Data collection in IoT now streams in from a rapidly growing number of IoT sensors, clickstream data, server logs, transactions, and telematics data generated from moving objects, like mobile devices, cars, trucks, aircraft, satellites, and ships. Often this data is pouring in at millions of records, or more, per second. Tables of IoT streaming data often range from tens of millions to tens of billions of rows. Sometimes hundreds of billions.
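To put those figures side by side, here is a rough back-of-the-envelope sketch. It assumes a steady ingest rate of one million records per second (the low end of the "millions of records per second" mentioned above) and asks how long a stream takes to reach the table sizes quoted. The function name and the constant rate are illustrative assumptions, not measurements from any real deployment.

```python
def seconds_to_reach(rows_target: int, rows_per_second: int) -> float:
    """Time in seconds for a steady stream to accumulate rows_target rows."""
    return rows_target / rows_per_second


# Assumed constant ingest rate: one million records per second.
RATE = 1_000_000

# Table sizes taken from the ranges quoted in the text.
for target in (10_000_000, 10_000_000_000, 100_000_000_000):
    secs = seconds_to_reach(target, RATE)
    print(f"{target:,} rows at {RATE:,}/s: {secs:,.0f} s ({secs / 3600:.2f} h)")
```

Under that assumption, a single stream passes ten billion rows in under three hours, which is why tables of this size are routine rather than exceptional for IoT workloads.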

Agility

IoT data is massive: it simply overwhelms traditional CPU architectures, forcing ever-expanding hardware footprints. To compensate for these limitations, engineers downsample, index, or pre-aggregate the data. This is wholly antithetical to the value proposition of IoT analytics use cases, which often require the agility to identify a single spatiotemporal event amidst billions of other events, not the average of a billion events or a sample of them.
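A small sketch can make the point about averaging and downsampling concrete. It uses synthetic data: 100,000 hypothetical sensor readings hovering around 20.0, with one anomalous reading of 95.0 planted at an arbitrary position. The values, threshold, and sampling interval are all assumptions chosen for illustration.

```python
import random
from statistics import mean

# Synthetic readings: steady values near 20.0 plus one planted anomaly.
rng = random.Random(7)
readings = [20.0 + rng.uniform(-0.5, 0.5) for _ in range(100_000)]
SPIKE_INDEX = 54_321  # hypothetical position of the single event of interest
readings[SPIKE_INDEX] = 95.0

# Pre-aggregation: the average barely moves, so the event is invisible.
overall_mean = mean(readings)

# Downsampling: keeping every 100th reading skips the spike entirely,
# because 54,321 is not a multiple of 100.
downsampled = readings[::100]
max_in_sample = max(downsampled)

# Full scan: examining every reading finds the event.
hits = [i for i, value in enumerate(readings) if value > 50.0]

print(f"mean of all readings: {overall_mean:.3f}")
print(f"max in 1% downsample: {max_in_sample:.3f}")
print(f"indices above threshold (full scan): {hits}")
```

Only the full scan surfaces the anomaly. The trade-off is that a full scan touches every row, which is exactly the workload that becomes slow on CPU-based tools at IoT scale.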

Spatiotemporal Data

At least 80% of data records created today contain spatiotemporal data. For IoT data, that percentage is even higher. Plotting these points is computationally intensive on CPU analytics tools, and rendering is nearly impossible. GPUs were designed to render graphically intensive video games, so they can plot and render millions of spatiotemporal points in milliseconds.
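As an illustration of what a spatiotemporal query looks like, the sketch below filters a handful of made-up events by a bounding box and a time window. It is plain Python running on a CPU and says nothing about GPU performance; the Event type, device names, and coordinates are hypothetical. The point is the shape of the work: the same independent test is applied to every record, which is the kind of operation that parallel hardware can spread across many rows at once.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One hypothetical IoT record with a location and a timestamp."""
    device_id: str
    lon: float
    lat: float
    ts: datetime


def in_window(e: Event, min_lon: float, min_lat: float,
              max_lon: float, max_lat: float,
              start: datetime, end: datetime) -> bool:
    """True if the event lies inside the bounding box and the half-open
    time range [start, end)."""
    return (min_lon <= e.lon <= max_lon
            and min_lat <= e.lat <= max_lat
            and start <= e.ts < end)


# Made-up sample events.
events = [
    Event("truck-7", -122.41, 37.77, datetime(2019, 5, 1, 9, 30)),
    Event("truck-9", -122.41, 37.77, datetime(2019, 5, 1, 14, 0)),  # outside time window
    Event("ship-2", -74.00, 40.71, datetime(2019, 5, 1, 9, 45)),    # outside bounding box
]

# Bounding box and one-hour window, both chosen arbitrarily for the example.
hits = [
    e for e in events
    if in_window(e, -123.0, 37.0, -122.0, 38.0,
                 datetime(2019, 5, 1, 9, 0), datetime(2019, 5, 1, 10, 0))
]

print([e.device_id for e in hits])
```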


To learn more, visit our GPU-accelerated Analytics Explainer.

 

The Future of IoT Data

The simultaneous growth of IoT and big data makes them inseparable. The evolution of cellular and Wi-Fi technologies promises a repeat of history. Fifth-generation (5G) mobile cellular networks are coming. They are designed to handle high data rates, reduce latency, reduce cost and energy consumption, achieve higher system capacity, and promote massive device connectivity.

The inexorable truth of 5G is a tsunami of new IoT streaming data to analyze and visualize. Industries are already planning to leverage the high bandwidth, ultra-low latency, and expanded geographic range of 5G to create more IoT sensors, instruments, and connected devices than ever before. 5G, IoT, and big data will soon be inseparable.

GPU-accelerated analytics are a timely, necessary addition to the world of data science and analytics. OmniSci is the only viable IoT data platform capable of ingesting and visualizing this torrent of data for reliable IoT analytics. Read more about how OmniSci pioneers GPU-accelerated analytics.

 
