1. Data volumes will continue to increase and migrate to the cloud
Most big data experts agree that the amount of generated data will keep growing exponentially. In its Data Age 2025 report for Seagate, IDC forecasts that the global datasphere will reach 175 zettabytes by 2025. To grasp how much that is, imagine the data stored on 128GB iPads stacked on top of each other. In 2013, the stack would have stretched two-thirds of the distance from the Earth to the Moon; by 2025, it would be 26 times longer.
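The iPad comparison is easy to sanity-check with back-of-the-envelope arithmetic. A rough sketch follows; the ~7.5 mm device thickness and the decimal unit prefixes are assumptions on my part, not figures from the IDC report:

```python
# Back-of-the-envelope check: 175 ZB expressed as a stack of 128GB iPads.
# Assumes decimal prefixes (1 ZB = 1e21 bytes, 1 GB = 1e9 bytes) and an
# iPad thickness of roughly 7.5 mm -- both illustrative assumptions.

datasphere_bytes = 175e21          # 175 zettabytes
ipad_bytes = 128e9                 # 128 gigabytes
ipad_thickness_m = 0.0075          # ~7.5 mm per device

num_ipads = datasphere_bytes / ipad_bytes
stack_height_m = num_ipads * ipad_thickness_m
earth_moon_m = 384_400_000         # average Earth-Moon distance in meters

print(f"{num_ipads:.2e} iPads")
print(f"stack: {stack_height_m / earth_moon_m:.1f} Earth-Moon distances")
```

Under these assumptions the stack works out to well over a trillion devices, many times the Earth-Moon distance.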
What makes experts believe in such rapid growth? First, the growing number of internet users who do everything online, from business communications to shopping and social networking.
Second, the billions of connected devices and embedded systems that create, collect, and share a wealth of IoT data every day, all over the world.
As enterprises gain the ability to analyze big data in real time, they will come to create and manage 60% of the world's data in the near future. However, individual consumers have a significant role to play in data growth, too. In the same report, IDC estimates that 6 billion users, or 75% of the world's population, will interact with online data every day by 2025. In other words, each connected user will have at least one data interaction every 18 seconds.
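The 18-second figure implies a simple per-user daily rate, which a trivial arithmetic sketch makes concrete:

```python
# One data interaction every 18 seconds, per connected user:
seconds_per_day = 24 * 60 * 60         # 86,400 seconds in a day
interactions_per_day = seconds_per_day / 18
print(interactions_per_day)            # 4800.0 interactions per user per day
```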
Such large datasets are challenging to store and process. Until recently, big data processing challenges were addressed by open-source ecosystems such as Hadoop and NoSQL databases. However, open-source technologies require manual configuration and troubleshooting, which is too complicated for most companies. In search of greater elasticity, businesses started migrating big data to the cloud.
AWS, Microsoft Azure, and Google Cloud Platform have transformed the way big data is stored and processed. Previously, a company that wanted to run data-intensive applications had to physically expand its own data center. Now, with pay-as-you-go services, cloud infrastructure offers agility, scalability, and ease of use.
This trend will certainly continue into the 2020s, but with some adjustments:
- Hybrid environments. Many companies can’t store sensitive information in the cloud, so they choose to keep a certain amount of data on premises and move the rest to the cloud.
- Multi-cloud environments. To address their business needs to the fullest, some companies store data across a combination of clouds, both public and private.
2. Machine learning will continue to change the landscape
Playing a huge role in big data, machine learning is another technology expected to impact our future drastically.
Machine learning is becoming more sophisticated with every passing year. We are yet to see its full potential—beyond self-driving cars, fraud detection devices, or retail trends analyses.
Wei Li
Vice President and General Manager at Intel
Machine learning is a rapidly developing technology used to augment everyday operations and business processes. In 2019, ML projects received more funding than all other AI systems combined.
Until recently, machine learning and AI applications were out of reach for most companies because the field was dominated by open-source platforms. Though open-source platforms were meant to bring these technologies closer to people, most businesses lack the skills to configure the required solutions on their own. Oh, the irony.
The situation changed once commercial AI vendors started building connectors to open-source AI and ML platforms and providing affordable solutions that don't require complex configuration. What's more, commercial vendors offer features that open-source platforms currently lack, such as ML model management and reuse.
Meanwhile, experts believe that computers' ability to learn from data will improve considerably thanks to unsupervised machine learning approaches, deeper personalization, and cognitive services. As a result, machines will become more intelligent and capable of reading emotions, driving cars, exploring space, and treating patients.
What fascinates me is combining big data with machine learning and especially natural language processing, where computers do the analysis by themselves to find things like new disease patterns.
Bernard Marr
Author, Big Data: Using SMART Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance
This is intriguing and scary at the same time. On the one hand, intelligent robots promise to make our lives easier. On the other hand, there are ethical and regulatory issues, pertaining, for example, to the use of machine learning in banking for making loan decisions. Giants such as Google and IBM are already pushing for more transparency by accompanying their machine learning models with technologies that monitor bias in algorithms.
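To make the bias-monitoring idea concrete, here is a minimal sketch of one common fairness check, the demographic parity gap, applied to toy loan decisions. The data, threshold interpretation, and function names are illustrative; this is not the actual tooling Google or IBM ship:

```python
# Minimal sketch of a demographic parity check for a loan-approval model.
# Toy data; real fairness auditing involves many metrics, not just this one.

def approval_rate(decisions):
    """Fraction of positive (approve) decisions, where 1 = approved."""
    return sum(decisions) / len(decisions)

def demographic_parity_gap(group_a, group_b):
    """Absolute difference in approval rates between two groups.
    A gap near 0 suggests similar treatment on this one metric;
    a large gap warrants investigation."""
    return abs(approval_rate(group_a) - approval_rate(group_b))

# Hypothetical model decisions for two demographic groups (1 = approved)
group_a = [1, 1, 0, 1, 1, 0, 1, 1]   # 75% approved
group_b = [1, 0, 0, 1, 0, 0, 1, 0]   # 37.5% approved

gap = demographic_parity_gap(group_a, group_b)
print(f"Demographic parity gap: {gap:.3f}")  # 0.375 for this toy data
```

A gap this large would flag the model for review; production systems track such metrics continuously as data drifts.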
3. Data scientists and CDOs will be in high demand
The positions of data scientist and chief data officer (CDO) are relatively new, but the need for these specialists on the labor market is already high. As data volumes continue to grow, so does the gap between the demand for data professionals and their availability.
In 2019, KPMG surveyed 3,600 CIOs and technology executives from 108 countries and found that 67% of them struggled with skill shortages (an all-time high since 2008), with the top three scarcest skills being big data/analytics, security, and AI.
No wonder data scientists are among the top fastest-growing jobs today, along with machine learning engineers and big data engineers. Big data is useless without analysis, and data scientists are those professionals who collect and analyze data with the help of analytics and reporting tools, turning it into actionable insights.
To qualify as a good data scientist, one needs deep knowledge of:
- Data platforms and tools
- Programming languages
- Machine learning algorithms
- Data manipulation techniques, such as building data pipelines, managing ETL processes, and prepping data for analysis
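The last bullet can be made concrete with a toy extract-transform-load step. The field names and data-quality rules below are illustrative assumptions, not any particular company's pipeline:

```python
# Toy ETL pipeline: extract raw records, transform (clean + type-cast),
# load into a destination. Real pipelines read from files, APIs, or
# databases and write to a warehouse; lists stand in for both here.

raw_rows = [
    {"name": " Alice ", "spend_usd": "120.50"},
    {"name": "Bob", "spend_usd": "80"},
    {"name": "", "spend_usd": "15.25"},   # missing name -> dropped
]

def extract(rows):
    # Stand-in for reading from a source system.
    return list(rows)

def transform(rows):
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:                      # basic data-quality rule
            continue
        cleaned.append({"name": name, "spend_usd": float(row["spend_usd"])})
    return cleaned

def load(rows, destination):
    # Stand-in for writing to a warehouse table.
    destination.extend(rows)
    return destination

warehouse = load(transform(extract(raw_rows)), [])
print(warehouse)
# [{'name': 'Alice', 'spend_usd': 120.5}, {'name': 'Bob', 'spend_usd': 80.0}]
```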
Striving to improve their operations and gain a competitive edge, businesses are willing to pay higher salaries to such talents. This makes the future look bright for data scientists.
In a further attempt to bridge the skill gap, businesses are also growing data scientists from within. These professionals, dubbed citizen data scientists, are no strangers to creating advanced analytical models, but they hold positions outside the analytics field per se. With the help of modern tools, however, they can do heavy data science work without holding a data science degree.
The situation is less clear with the chief data officer role, though. A CDO is a C-level executive responsible for a company's big data governance, availability, integrity, and security. As more business owners realize the importance of this role, hiring a CDO is becoming the norm: 67.9% of major companies already have one in place, according to the Big Data and AI Executive Survey 2019 by NewVantage Partners.
However, the CDO position remains ill-defined, particularly in terms of responsibilities, or, more precisely, how those responsibilities should be split between CDOs, data scientists, and CIOs. It is not a one-size-fits-all role; it depends on a company's business needs and digital maturity. Consequently, the CDO position is set to see a good share of restructuring and will evolve as the world becomes more data-driven.