Big Data is changing the way we interact with the world, and the way the world interacts with us. Big Data is the aggregation of large amounts of information that is analyzed in order to uncover hidden insights. The data resides in cloud-based servers or internal data warehouses. Big Data is characterized by its volume, velocity, and variety: it is amassed in massive quantities, at high speed, and in many forms. By measuring qualitative and quantitative inputs, sponsors and broker-dealers can make efficient, evidence-based decisions. Sponsors and broker-dealers that adopt a metrics-based approach can increase the productivity of the firm and its affiliates. Most importantly for broker-dealers and sponsors, collecting and analyzing data can provide a competitive edge and greater efficiencies in raising capital and maintaining compliance.
“In addition, broker-dealers have started to use data compiled through these solutions to glean valuable insight into adviser performance.” (Michael Brodeur, Investment News)
Big Data Breakdown
Big Data is composed of structured, semi-structured, and unstructured data. Structured data follows a fixed, predefined format, so it is easily captured, stored, managed, and interpreted. Examples include names, ages, genders, numeric values, and dates. Semi-structured data does not fit a rigid table layout but still carries tags or labels that describe it; examples include email (whose headers are labeled even though the body is free text), job titles, locations, and project features. It does not reside in the main database but is tagged to an associated file in case it needs to be accessed again. Keeping semi-structured data in a separate database frees space within the main data warehouse. Unstructured data is raw data that is harder to quantify and needs to be broken down into its individual elements (e.g., the body text of e-mail messages, webpages, presentations, and word processing documents).
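To make the three categories above concrete, here is a minimal Python sketch. The record contents, field names, and the `extract_dates` helper are hypothetical and chosen only for illustration; real systems would use far richer schemas and parsing.

```python
import json
import re

# Structured: fixed fields with known types, like one row in a database table.
structured_row = {"name": "Jane Doe", "age": 42, "joined": "2019-03-01"}

# Semi-structured: the record describes itself with tags, but different
# records are not required to share the same set of fields.
semi_structured = json.loads(
    '{"email": "jane@example.com", "title": "Adviser", "tags": ["mentor"]}'
)

# Unstructured: free text that must be broken down before it can be analyzed.
unstructured = (
    "Met with Jane on 2021-05-14; follow up on 2021-06-01 about the new offering."
)


def extract_dates(text):
    """Pull ISO-style dates (YYYY-MM-DD) out of free text."""
    return re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)


dates = extract_dates(unstructured)
print(structured_row["age"], semi_structured["tags"], dates)
```

The structured and semi-structured records can be queried by field name right away, while the unstructured note only yields usable values after an extraction step.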
Wal-Mart creates 2.5 petabytes of data on consumers every hour; a petabyte of data “is equivalent to 20 million traditional filing cabinets of text.” (Erevelles, Fukawa, & Swayne, 2015)
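For a sense of scale, the two figures quoted above can be multiplied out. This is back-of-the-envelope arithmetic only; the daily figure assumes the hourly rate holds constant over 24 hours, which the source does not state.

```python
PETABYTES_PER_HOUR = 2.5             # figure quoted above
CABINETS_PER_PETABYTE = 20_000_000   # figure quoted above

# Filing cabinets' worth of text generated each hour.
cabinets_per_hour = PETABYTES_PER_HOUR * CABINETS_PER_PETABYTE

# Assumption: the hourly rate is constant across a full day.
cabinets_per_day = cabinets_per_hour * 24

print(f"{cabinets_per_hour:,.0f} cabinets per hour")
print(f"{cabinets_per_day:,.0f} cabinets per day")
```

Under those assumptions, that works out to 50 million filing cabinets of text every hour, or 1.2 billion per day.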
Big Data can be collected through cookies, heat maps, GPS tracking, signal tracking, in-store Wi-Fi monitoring, credit or loyalty cards, IoT sensors, or facial recognition cameras. Heat maps visualize eye-tracking and mouse-tracking data to measure consumer engagement and behavior trends.
The Five Components of Big Data
1) Data Mining: Unstructured and structured data are gathered from the internal data warehouse or cloud-based server
2) Data Storage: Data is stored and prepared for analysis
3) Data Analysis: Statistical analysis tools are used to analyze data before sharing
4) Data Sharing: Data is shared internally or with third parties for visualization
5) Data Visualization: Visualization tools are used to interact with the data and bring a presentation to life
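The five components above can be sketched as a toy pipeline. Everything here is hypothetical: the function names, the adviser names, the amounts, and the text bar chart stand in for the warehouse connectors, statistical packages, and visualization tools a firm would actually use.

```python
import json
from collections import defaultdict


def gather(*sources):
    """Step 1: pull records from every source into one list."""
    return [record for source in sources for record in source]


def store(records):
    """Step 2: keep only records that are complete enough to analyze."""
    return [r for r in records if r.get("adviser") and r.get("amount") is not None]


def analyze(records):
    """Step 3: total the amount per adviser."""
    totals = defaultdict(float)
    for r in records:
        totals[r["adviser"]] += r["amount"]
    return dict(totals)


def share(summary):
    """Step 4: serialize the result so another team or tool can read it."""
    return json.dumps(summary, sort_keys=True)


def visualize(summary, unit=50):
    """Step 5: render a simple text bar chart, one '#' per `unit`."""
    return "\n".join(
        f"{name:<8}{'#' * int(total // unit)} {total:g}"
        for name, total in sorted(summary.items())
    )


# Hypothetical inputs: one internal warehouse extract and one cloud extract.
warehouse = [{"adviser": "Avery", "amount": 100}, {"adviser": "Blake", "amount": 150}]
cloud = [{"adviser": "Avery", "amount": 200}, {"adviser": None, "amount": 75}]

summary = analyze(store(gather(warehouse, cloud)))
payload = share(summary)
print(visualize(json.loads(payload)))
```

Each stage takes the previous stage's output as its input, which mirrors the ordering of the list: the record with no adviser is dropped at the storage step, before any analysis is run.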
Hazards & Limitations of Data Analysis
Although data analysis can be a powerful tool for broker-dealers and sponsors, certain statistical limitations and hazards should be kept in mind when assessing the validity of the data. Below are some of the hazards that must be considered:
- Incomplete Data: Does the data suffer from gaps that bias the results?
- Data Mining: Does the method of analysis lead to spurious inferences as to causality?
- Data Cleaning: Were invalid data points properly removed from the dataset?
- Analysis of Apparent Outliers: Were outliers examined to determine if the data is valid?
- Assumption of Normality: Was the data checked to confirm that it roughly fits a Gaussian (normal) curve before a statistical analysis that assumes normality was conducted?
- Retrospective Bias: Were there exposures to suspected risk factors or protective factors, related to the perceived outcome, that occurred before the analysis of the data was conducted?
- Ecological Fallacy: Were correlations observed at the group level mistakenly applied to individuals? Were proper measures in place to distinguish the averages of individual groups from the overall average?
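Several of the checks above (incomplete data, data cleaning, apparent outliers, and a rough look at normality) can be sketched in a few lines of Python. The figures are made up for illustration, and the 1.5 × IQR fence is a common rule of thumb rather than a method prescribed by this article; a flagged point still needs to be examined by a person before it is kept or discarded.

```python
import statistics

# Hypothetical monthly figures; None marks a missing entry, -5 is an invalid value.
raw = [120, 135, 128, None, 140, -5, 131, 125, 990, 138]

# Incomplete data: count the gaps before deciding whether results could be biased.
missing = sum(1 for v in raw if v is None)

# Data cleaning: drop missing and impossible (negative) values.
clean = [v for v in raw if v is not None and v >= 0]

# Apparent outliers: flag points more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = statistics.quantiles(clean, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in clean if v < low or v > high]

# Rough normality check: a mean far from the median hints at a skewed distribution.
mean_with = statistics.mean(clean)
median_with = statistics.median(clean)

# Compare against the mean once the flagged points are set aside.
inliers = [v for v in clean if v not in outliers]
mean_without = statistics.mean(inliers)

print(missing, outliers, mean_with, median_with, mean_without)
```

In this made-up sample, a single extreme value pulls the mean well above the median, which is the kind of gap that should prompt a closer look before any test that assumes a Gaussian curve is applied.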
Why is Big Data Important?
Big Data is revolutionizing information technology, and companies are using Big Data in various ways. Coca-Cola uses data from emails, phone calls, and purchasing behavior to promote customer retention and acquisition. Netflix uses algorithms to collect watch data, search history, and ratings to enhance the subscriber’s experience.
In the finance industry, Goldman Sachs uses Big Data to fuel investment decisions and its research agenda. Fidelity’s Wealthscape collects and manages advisor data to improve workflow efficiency and productivity. The retail banking industry uses Big Data to analyze consumer spending patterns, consumer preferences, transactional activities, fraud prevention patterns, compliance workflows, customer feedback, and much more. The volume, velocity, and variety of data allow broker-dealers and sponsors to uncover hidden insights that may have been overlooked in the past.
“Such data can help broker-dealers target specific support for struggling advisers and pair mentors more effectively.” (Michael Brodeur, Investment News)
By analyzing Big Data, broker-dealers and sponsors can make productive yet cost-effective decisions to increase the workflow efficiency of the firm and its representatives. In the business-to-consumer (B2C) field, Big Data Consumer Analytics plays a massive role in uncovering hidden insights regarding client preferences and behaviors.
Big Data Consumer Analytics is “the extraction of hidden insight about consumer behavior from Big Data and the exploitation of that insight through advantageous interpretation.” (Erevelles, Fukawa, & Swayne, 2015)
The concept of Big Data Consumer Analytics will be analyzed further in upcoming blog posts. Stay tuned!
By staying up to date with emerging technologies, broker-dealers and sponsors can greatly increase their performance and productivity. Big Data makes it possible to satisfy the needs of partners, clients, and customers efficiently. Sponsors or broker-dealers that fail to keep up with the disruptive technologies available today may face difficulty staying competitive in an increasingly client-centric landscape.