Tag: anomaly detection, azure databricks, azure machine learning, data quality, deduplication, machine learning
Musings on Data Quality
Prasad Kulkarni | Mar 29, 2022
Introduction: For a successful Machine Learning or Data Science practice, the following elements are key: Business Case, Quality Data, Skilled Teams, Technology, Risk...
Data Profiling in Power BI (using Azure Databricks)
Prasad Kulkarni | Feb 12, 2022
In the Microsoft ecosystem, there are two worlds, i.e. MS Azure and MS Office 365. They are two different Active Directories in the Microsoft world. Hence, they have their own tools to...
Data Profiling options in Azure
Prasad Kulkarni | Dec 28, 2021
The first step of Data Science, after Data Collection, is Exploratory Data Analysis (EDA). However, the first step within EDA is getting a high-level overview of how the...
Motivating DP-100: Designing and Implementing a Data Science Solution on Azure
Prasad Kulkarni | Nov 28, 2021
Data Science has come a long way. From Jupyter notebooks on Data Scientists’ laptops, we have moved to complex ML workflows running in cloud infrastructure....
Azure Databricks source in PowerBI
Prasad Kulkarni | Sep 06, 2021
Microsoft PowerBI is a great tool for Data Visualization. It can connect to a variety of sources. However, databases remain a popular data source. But, what if you...
Motivating Databricks Delta in Azure
Prasad Kulkarni | Aug 26, 2021
Exploratory data analysis entails a lot of ad-hoc analysis. To do so, analysts have to rely on either databases or file systems like data lakes. Now, to analyze these...
Koalas Dataframe plotting powered by Plotly
Prasad Kulkarni | Aug 02, 2021
In Data Science, Exploratory Data Analysis is an essential process. And since, as they say, a picture is worth a thousand words, visual tools play a key role in...
Building Analytical System on Azure Data Lake Gen2
Prasad Kulkarni | Aug 31, 2020
We live in the world of Big Data and Analytics. It’s a fast-changing world with new technologies emerging at a fast pace. This pace has increased considerably with...
Azure Data Lake Gen2 and Azure Databricks
Prasad Kulkarni | Jun 13, 2020
Before Azure Data Lake Gen2 and Azure Databricks: In our previous articles, we elaborated on two aspects of Azure Data Lake Gen2 migration, i.e. governance and...
Cumulative Distribution in Azure Databricks using Spark SQL
Prasad Kulkarni | May 24, 2020
We can solve every problem in multiple ways. In our previous article, we motivated the need to fit cumulative distributions. Moreover, we demonstrated the same in Azure...