Hello again,

The second part of your first steps in data science is to explore the data science marketplace, so that you understand how data science is used in practice.

The data science marketplace is varied. Below are some key markets and applications:

Security and Fraud Detection: 

  • Big Data System in Abu Dhabi to prevent Terrorism

In Abu Dhabi, top security experts have presented a novel security concept, built around a big data system, to Abu Dhabi Autonomous Systems Investments, Tawazun Company. The system would screen all the data flowing into the databases of government authorities, which can then be used to prevent cybercrime or terrorist activity. These big data systems apply a statistical data model and filter the data accordingly. Australia, the US and the UK are already using systems of this kind. Such systems also help governments gauge how the population feels about issues raised on social media. Several opposition groups use social media to organize protests and terror attacks, which could be prevented by introducing this kind of big data system in the UAE.

  • European Government develops POLE Data Model to Store and Record Incidents

The case of the three girls from London who travelled to Syria to join ISIS, which made news headlines, might have been prevented had this model been developed earlier. One of the three girls was in contact on Twitter with another girl who was known to the authorities for joining ISIS. A big data solution has been developed that works on the POLE (Person, Object, Location and Event) data model for storing and recording suspicious entities and incidents. The recorded people (entities) in the system can be linked to other events or people any number of times, building a network of associations that keeps track of suspicious people. This data can be retrieved and updated quickly, in real time.
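To make the idea of linking entities concrete, here is a minimal sketch of a POLE-style store in Python. It is not the system described above: the class names (`Entity`, `PoleGraph`), the `associates` method and the sample records are all hypothetical, and a real deployment would sit on a database rather than an in-memory dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # one of "person", "object", "location", "event"
    name: str


class PoleGraph:
    """Undirected graph of associations between POLE entities."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # An entity can be linked any number of times, to any other entity.
        self._links[a].add(b)
        self._links[b].add(a)

    def associates(self, start, max_hops=2):
        """Return every entity reachable from `start` within `max_hops` links."""
        seen = {start}
        frontier = [start]
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for neighbour in self._links[node]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        next_frontier.append(neighbour)
            frontier = next_frontier
        seen.discard(start)
        return seen


# Hypothetical records, for illustration only.
person_a = Entity("person", "Person A")
person_b = Entity("person", "Person B (known to authorities)")
message = Entity("event", "Twitter conversation")
city = Entity("location", "City X")

graph = PoleGraph()
graph.link(person_a, message)
graph.link(person_b, message)
graph.link(person_b, city)

# Person A is two hops away from Person B, via the shared event.
print(graph.associates(person_a, max_hops=2))
```

Storing the links in both directions is what lets the network of associations be walked from any starting entity, whether that is a person, an object, a location or an event.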

  • Use of Machine Learning and Analytics to predict Online Fraud

RSA, the cyber security arm of the US big data company EMC, uses machine learning and advanced big data analytics to prevent online fraud. It has detected approximately 500,000 attacks in 8 years, half of which were identified in 2012 alone. RSA’s Israeli operation moved away from a rule-based fraud detection system in favour of a self-improving method that uses data science-led methodologies reinforced by Bayesian inference.

Every time an RSA client makes a transaction through online banking, 20 factors are stored in the Anti-Fraud Command Centre (AFCC) database. These 20 factors are then pooled into 150 fraud risk features, where each risk feature is a combination of 2 or more of the recorded factors. For instance, a combination of MAC address and IP address can predict fraud better than the IP address alone. The risk features are then combined into groups with Bayesian predictors, depending on the patterns in which they indicate fraudulent activity.

Detica, the data intelligence arm of BAE Systems in the UK, implements similar technology, using various data science techniques to identify advanced persistent threats that had previously gone unnoticed.
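The combination of risk features with Bayesian predictors can be illustrated with a small sketch. This is not RSA's actual model: it assumes a naive Bayes combination (each risk feature treated as independent), and the feature name, the prior and the likelihood values are all made up for the example.

```python
import math


def new_device_and_new_ip(tx, known):
    """Hypothetical risk feature built from two recorded factors (MAC and IP)."""
    return tx["mac"] not in known["macs"] and tx["ip"] not in known["ips"]


def fraud_posterior(prior, likelihoods, observed):
    """Naive Bayes estimate of P(fraud | observed risk features).

    likelihoods: {feature: (P(fires | fraud), P(fires | legitimate))}
    observed:    {feature: True if the feature fired, else False}
    """
    log_odds = math.log(prior / (1 - prior))
    for feature, fired in observed.items():
        p_fraud, p_legit = likelihoods[feature]
        if fired:
            log_odds += math.log(p_fraud / p_legit)
        else:
            log_odds += math.log((1 - p_fraud) / (1 - p_legit))
    return 1 / (1 + math.exp(-log_odds))


# Illustrative numbers only: 1% of transactions are fraudulent, and the
# combined feature fires in 60% of frauds but 2% of legitimate transactions.
likelihoods = {"new_device_and_new_ip": (0.60, 0.02)}
known = {"macs": {"aa:bb:cc:dd:ee:ff"}, "ips": {"203.0.113.7"}}
tx = {"mac": "11:22:33:44:55:66", "ip": "198.51.100.9"}

observed = {"new_device_and_new_ip": new_device_and_new_ip(tx, known)}
print(fraud_posterior(0.01, likelihoods, observed))
```

Working in log-odds keeps the update additive, so each extra risk feature simply adds its own evidence term to the running score.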

[Source http://www.kdnuggets.com/2015/12/big-data-science-security-fraud-detection.html]

Social media analytics:


IBM has a product called Personality Insights, which offers a profiling service for companies that would like to know more about their customers. In social media analytics, text mining and parsing are an important and necessary first step. Social media companies often make their content available through their application programming interface, or API.

Using this API, data scientists can retrieve the data they want. Collecting social media data is one thing, but manipulating it for analysis is another. A lot of skill and effort is needed before analytics methods can be applied, although standards like JSON help.
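As a small illustration of that parsing step, the sketch below turns a JSON payload into word counts using only the Python standard library. The payload and its field names (`user`, `text`) are hypothetical; every real API defines its own response format.

```python
import json
import re
from collections import Counter

# Hypothetical API response; real field names depend on the platform.
raw = """
[
  {"user": "alice", "text": "Loving the new phone, wonderful camera!"},
  {"user": "bob", "text": "The new update is wonderful."}
]
"""


def word_counts(payload):
    """Parse a JSON list of posts and count the words in their text fields."""
    posts = json.loads(payload)
    counts = Counter()
    for post in posts:
        # Lowercase first so "Wonderful" and "wonderful" count as one word.
        words = re.findall(r"[a-z']+", post.get("text", "").lower())
        counts.update(words)
    return counts


print(word_counts(raw).most_common(3))
```

Because JSON maps directly onto Python lists and dictionaries, most of the effort goes into cleaning the text rather than into decoding the response.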

Disease control:

The University of Pennsylvania conducted a study on the predictive relationship between Twitter post content and heart disease. Emotional factors are linked to heart disease, so the study identified indicators of emotional distress expressed in words and correlated them with the occurrence of heart disease.

The study used linguistic analysis and various big data analytics techniques to show that emotional key words such as "hate" were strongly correlated with the incidence of heart disease. Positive words like "wonderful", on the other hand, showed the opposite correlation. The Twitter data collected consisted of tweets posted in 2009 and 2010 from counties that are home to 88 percent of the US population.
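The kind of correlation described here can be sketched with a plain Pearson coefficient. The numbers below are invented for illustration and are not data from the study; the region-level word rates and disease rates are hypothetical, and the study itself used more elaborate techniques.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five regions: how often a word appears per 1,000 tweets,
# and a heart disease rate for the same regions.
hate_rate = [1.2, 2.5, 3.1, 4.0, 5.2]
wonderful_rate = [6.0, 5.1, 4.2, 3.0, 2.2]
disease_rate = [110, 135, 150, 170, 190]

print(pearson(hate_rate, disease_rate))       # positive: rises with disease rate
print(pearson(wonderful_rate, disease_rate))  # negative: falls as disease rate rises
```

A coefficient near +1 or -1 only shows that two quantities move together across regions; it does not by itself say that one causes the other.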

Recommender Systems:

Recommender systems are among the most fun and profitable applications of data science in the big data world. Training data (the historical search, browse, purchase, and feedback patterns of your customers) can be converted into golden opportunities for ROI (i.e., Return On Innovation and Investment). The predictive analytics tools of data science yield a bonanza of mechanisms to engage your customers and enrich their experience. What better loyalty program can there be than one that offers customers what they want before they ask (and sometimes even before they think of it themselves)? Yes, we know of some cases that have gone bad (such as the secretly pregnant teen and the targeted coupons that Target sent to her father), and we recognize that there is a fine line between being intimate with your customers and being intimidating, but people usually do like to receive offers for great products that they love.

[Source http://www.datasciencecentral.com/profiles/blogs/recommender-systems-past-present-and-future]
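To show how purchase history can become a recommendation, here is a minimal sketch based on item co-occurrence ("customers who bought this also bought that"). It is one simple technique among many, not the method of any particular retailer; the function names and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    co = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            co.setdefault(a, Counter())[b] += 1
            co.setdefault(b, Counter())[a] += 1
    return co


def recommend(co, owned, top_n=3):
    """Suggest items that most often co-occur with what the customer owns."""
    scores = Counter()
    for item in owned:
        for other, count in co.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase history.
baskets = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "camera bag"],
    ["laptop", "mouse"],
]

co = build_cooccurrence(baskets)
print(recommend(co, {"camera"}))
```

Production systems usually go further, for example by normalizing for item popularity or by learning from ratings, but the co-occurrence table already captures the basic idea of learning from past customer behaviour.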

I strongly recommend reading the article below, which explains how data science came to exist in different industries.


See you in the next article [The skills needed to become a data scientist]. Enjoy 🙂