IBM - Modern Data Warehouse Example

A top-management viewpoint on the modern data warehouse

Many technologists are promoting the replacement of the data warehouse with the data lake. In the view of our data science team, a modernized data warehouse still carries its own value for analytics today.

There is an article sharing the viewpoint of Neil Carson, CEO of Yellowbrick, on which vital technologies and tools are being used.

Original Article

His opinion on Hadoop aligns with that of our data science evangelist, Samuel Sum. Hadoop is a large-scale data store, but it is not easy to access and manage in many situations. It is usually better to have a structured data store, such as a data warehouse, for easy user access. Also, a key point of a successful data analytics environment is the speed of the underlying storage. With SSDs (flash memory), data warehouse operations (both ETL and access) are now several times faster than before.

Finally, the editor of this webpage suggests reading an article by Samuel Sum about the data lake (Data Lake VS Data Warehouse).


machine learning

Machine Learning – some vital keys

This week, we would like to share an article highlighting the keys to success for machine learning.

In the original article, the author(s) list the keys to success for machine learning:

  1. Start small with Machine Learning – as with all other data analytics projects, a smaller scope is easier to manage.
  2. Machine Learning Must Have Data Quality to Succeed – data is the most important ingredient of data analytics, so its quality is critical.
  3. No Universal Machine Learning Algorithms – machine learning is an approach to solving one single “specific” problem, and the algorithms should be chosen for the corresponding use case.
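The data quality point can be made concrete with a small check run before any model is trained. The following is only a minimal sketch in Python: the function name `quality_report`, the field names and the sample rows are all made up for illustration and do not come from the original article.

```python
from collections import Counter


def quality_report(rows, required_fields):
    """Count missing values per field and exact duplicate rows."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        # A value counts as missing when it is None or an empty string.
        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                missing[field] += 1
        # Two rows are duplicates when all their key/value pairs match.
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": dict(missing), "duplicates": duplicates}


# Hypothetical sample data, for illustration only.
sample = [
    {"customer_id": "C001", "age": 34, "city": "Hong Kong"},
    {"customer_id": "C002", "age": None, "city": "Hong Kong"},
    {"customer_id": "C001", "age": 34, "city": "Hong Kong"},
    {"customer_id": "C003", "age": 51, "city": ""},
]

report = quality_report(sample, ["customer_id", "age", "city"])
print(report)
```

A report like this is a cheap first step in a small-scoped project: if many values are missing or duplicated, it is usually worth fixing the data before choosing any algorithm.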

Original Article at


Identity of you

Where you go tells who you are

Sharon Di, assistant professor of civil engineering and engineering mechanics at Columbia Engineering, has discovered that travel patterns are highly related to the types of people. The research is based on data collected by the University of Michigan Transportation Research Institute (UMTRI), covering 349 vehicles’ continuous one-year mobile traces (19,130 travel activities). The types of travelers identified include:

  • Seniors, who travel to a wider variety of places in a day
  • Workers, who stay mostly at work or at home
  • Parents, who visit more individual places in a day

News Shared by


Recently, the mainland China technology giant Tencent published a big data research report on usage patterns of WeChat (a mobile app similar to WhatsApp). They found that people born in the 90s are the most stressed. On the other hand, people born in the 70s have the most leisure time.

In our opinion, the findings should be valuable for social planning. However, privacy should be maintained, and only “masked” identities should be used for analysis.


Big Data & Analytics

Data Science Trend 2019

As 2019 has just started, it is time to share different experts’ viewpoints on the trends in data science.


Interestingly, what they are saying is “not too surprising”, and many of the trends are already happening in practice. For example, management at large corporations such as Oracle is still talking about Artificial Intelligence (AI) and Machine Learning (ML) in an interview with

Article by


However, another article tries to consolidate different sources to find common ground on the data science trends in 2019.

Article by

This article covers a wider range of areas, such as Virtual Reality and Information Security.


To sum up, data science is maturing and now supports more solutions. We are moving from data analytics to intelligent automation.


data warehouse

Data Lake VS Data Warehouse

Our team leader and founder, Samuel Sum, has written an article on his blog about the data lake and the data warehouse. Many people are trying to drop their data warehouses. However, Samuel gives his viewpoint on the value of the data warehouse. He also discusses his suggested data lake architecture, drawing on the real-world experience of our professional service team.

Megadeals VS Data Analytics

Suggested Article by McKinsey about Megadeals and Data & Analytics

We are always visiting our clients to discuss the value of data analytics. One point of debate is that data analytics fails to provide insights for exceptionally big deals in B2B business. This situation applies to a consulting business like ours, where 1 to 3 megadeals on the scale of millions contribute as much as 30% to 40% of total revenue.

Most people think that data analytics is only good for “regular” deals rather than “mega-deal(s)”, because of limited data availability. Nevertheless, we have won some projects through extensive research and development based on our own KPI knowledge base, open data and data from the Census and Statistics Department.

Original Viewpoint from McKinsey:

The McKinsey article shares the viewpoint that high-quality “small” data could be the key to making a megadeal.

Data Mining

Considerations on Data Mining & Predictive Analytics

This week, we are sharing another article on what we should care about in data mining and/or predictive analytics. For a business, the aim is to improve its competitive advantage against peers in the same market. In this article, the writer shares the fundamentals to consider when investing in analytics.

Original Article by Vikash Kumar:

We believe that a basic understanding of the core of data analytics is important for everyone nowadays.



Data Science & Personal Data Protection

Another day, another article worth reading. This week, we would like to highlight the importance of privacy in data collection. Even though GDPR now applies in EU countries, there is still room for improvement in handling data privacy. Many data analysts and data scientists collect too much detail, including unnecessary personal data, for their projects.

Article from ITPRO (

The article shares the bad example of Microsoft collecting personal information from Office 365.

As an experienced data team, we put “ethical data science” as our first priority. Masked personal information and anonymized data should be enough for most cases of data analytics. Therefore, institutions and governments should continually refine guidelines and rules on data collection so that personal data protection keeps pace with technology development.
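As an illustration of masking, the sketch below replaces a direct identifier with a keyed hash and keeps only the fields needed for analysis. It is a minimal example under stated assumptions, not a description of our production setup: the key, the field names (`user_id`, `age_band`, `district`) and the sample record are hypothetical. It uses only Python's standard `hmac` and `hashlib` modules.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored outside the code.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical whitelist of fields that the analysis actually needs.
ANALYSIS_FIELDS = {"age_band", "district"}


def mask_id(value, key=SECRET_KEY):
    """Return a keyed SHA-256 hash (HMAC) of an identifier, shortened to 16 hex chars."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record, id_field="user_id"):
    """Drop fields outside the whitelist and replace the identifier with its mask."""
    masked = {k: v for k, v in record.items() if k in ANALYSIS_FIELDS}
    masked["masked_id"] = mask_id(record[id_field])
    return masked


# Hypothetical raw record, for illustration only.
raw = {
    "user_id": "alice@example.com",
    "name": "Alice",
    "phone": "5550100",
    "age_band": "30-39",
    "district": "Sham Shui Po",
}

masked = mask_record(raw)
print(masked)
```

The same input always maps to the same masked ID, so records can still be joined and counted per person without exposing who the person is. Note that a keyed hash is pseudonymization rather than full anonymization, so the key itself still has to be protected.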


ShamShuiPo - Gentrification

Identifying Gentrification and Predicting Demand Changes

There is an article discussing gentrification. In the US, the term originally refers to higher-income (white) people moving their homes and businesses into low-income minority neighborhoods. However, a similar situation is found in developed cities like Hong Kong, where professionals and middle-class residents are moving into Shamshuipo (a district with the lowest average household income) due to urban renewal and newly built tall residential buildings. This changes the business environment. You couldn’t find any café in this area 10 years ago. Now, however, there are 5 different “luxury” cafés in the district of about 1,047 hectares.

Original Article:

To cope with the dynamics of a city, data is very important for businesses to identify market trends in home building and to estimate the demand for commercial space by category of activity. Data science and data analytics help to explore more possibilities with insights and predictions from data.
