IBM Research has launched a cloud analytics service to connect apps with a range of big geospatial datasets, covering maps, satellite, weather, and population changes.
The service, dubbed PAIRS Geoscope, lets developers use IBM’s REST API to add geospatial and time-based data to their own apps.
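To give a feel for what a geospatial and time-based request involves, here is a minimal Python sketch that assembles a query body for one point and one date range. The field names, the layer ID, and the `build_query` helper are all assumptions made for illustration; this is not IBM's documented request schema, and the sketch makes no network calls.

```python
import json
from datetime import datetime, timezone


def build_query(layer_ids, lat, lon, start, end):
    """Assemble a HYPOTHETICAL geospatial-temporal query body.

    The structure (layers / spatial / temporal) is an illustrative
    assumption, not the real PAIRS request format.
    """
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("coordinates out of range")
    if end < start:
        raise ValueError("end precedes start")
    return {
        # Which data layers to read (IDs here are made up).
        "layers": [{"id": layer_id} for layer_id in layer_ids],
        # Where: a single latitude/longitude point.
        "spatial": {"type": "point", "coordinates": [lat, lon]},
        # When: an inclusive time window in ISO 8601 format.
        "temporal": {"start": start.isoformat(), "end": end.isoformat()},
    }


if __name__ == "__main__":
    payload = build_query(
        ["example-satellite-layer"],  # hypothetical layer ID
        38.3,
        -122.4,
        datetime(2016, 1, 1, tzinfo=timezone.utc),
        datetime(2016, 2, 1, tzinfo=timezone.utc),
    )
    print(json.dumps(payload, indent=2))
```

An app would serialize a body like this and send it to the service; the point of the sketch is only that every request pairs a "where" with a "when".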
Until now, PAIRS Geoscope has been restricted to scientists, but IBM Research has launched a website for all developers to test out the resource.
PAIRS stands for Physical Analytics Integrated Repository and Services and is IBM’s answer to the challenge of blending large structured datasets, such as satellite and weather data, with unstructured data, such as location and timestamp data in tweets.
IBM Watson researchers first described the PAIRS integration engine in a 2015 paper, noting that it was built on the big-data technologies Hadoop and HBase.
PAIRS is supposed to take the “dirty work” out of acquiring data and searching for insights across multiple data sources in multiple data formats.
IBM’s PAIRS has a number of available datasets, including data from NASA’s Aqua and Terra satellites, US government data on soil, NOAA weather forecasts, US Geological Survey Landsat data and more.
Google launched its geospatial service in 2016, offering developers access to Landsat and the EU’s Sentinel-2 satellite images, which combined contained nearly 1.5 petabytes of data to work with.
The two key Google Earth datasets were introduced to Google Cloud, allowing developers to build forecasting services with its machine-learning and compute-engine tools.
In 2016, IBM researchers also began collecting drone images of earth from a DJI Phantom 3 Standard and uploading them to PAIRS, where the images are matched with other data sources and overlaid with soil property, satellite and weather data.
IBM says its PAIRS platform has “many petabytes of data” but declined to provide a figure for the combined size of the datasets.
PAIRS users can also upload proprietary data, such as IoT sensor readings, to combine with existing data layers. This feature could be useful in IoT deployments that measure soil moisture to predict irrigation needs.
Indeed, PAIRS’ roots can be traced back to an IoT precision irrigation system that IBM helped US mega-vineyard E & J Gallo Winery develop.
Hundreds of sensors, satellite imagery, and a cloud communications network combined with weather, meteorological and atmospheric data help it monitor vegetation, estimate water loss and predict future irrigation needs.
IBM says it has trial deployments of the platform with clients in agriculture, finance, energy, and meteorology.
The company says the PAIRS repository is growing by terabytes every day. It can “automatically ingest, curate, and seamlessly integrate all forms of geospatial-temporal data”, according to IBM, turning large, heterogeneous, and complex datasets “into a tidy aligned and indexed structure designed for efficient retrieval and query”.
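One way to picture an "aligned and indexed structure" is a lookup keyed by grid cell and time bucket, so that readings from different sources land in the same slot when they cover the same place and time. The Python sketch below illustrates that general idea only. The grid resolution, the one-hour bucket, and the function names are assumptions chosen for the example, and nothing here describes how PAIRS is actually implemented on Hadoop and HBase.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone

CELL_DEG = 0.25        # assumed grid resolution, in degrees
BUCKET_SECONDS = 3600  # assumed time bucket: one hour


def space_time_key(lat, lon, when):
    """Snap a point and timestamp to a (row, col, time bucket) key."""
    row = math.floor((lat + 90.0) / CELL_DEG)
    col = math.floor((lon + 180.0) / CELL_DEG)
    bucket = int(when.timestamp()) // BUCKET_SECONDS
    return (row, col, bucket)


def build_index(records):
    """Group (source, lat, lon, when, value) records by aligned key."""
    index = defaultdict(dict)
    for source, lat, lon, when, value in records:
        index[space_time_key(lat, lon, when)][source] = value
    return index


if __name__ == "__main__":
    # Made-up readings: a weather layer and an uploaded soil sensor
    # that fall in the same grid cell within the same hour.
    records = [
        ("weather_temp_c", 38.30, -122.40,
         datetime(2016, 6, 1, 12, 5, tzinfo=timezone.utc), 24.1),
        ("soil_moisture", 38.40, -122.30,
         datetime(2016, 6, 1, 12, 50, tzinfo=timezone.utc), 0.31),
    ]
    for key, layers in build_index(records).items():
        print(key, layers)
```

Once heterogeneous data shares a key like this, combining an uploaded sensor layer with a satellite or weather layer becomes a lookup rather than a search, which is the practical benefit the "efficient retrieval and query" claim points to.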