

This blog post gives an overview of the data profiling techniques within the SAP Information Steward data quality tool. I will start with an overview and then explain the most commonly used data profiling technique in SAP Information Steward, which is column profiling. This article will guide you through the step-by-step procedure and give you a complete idea of how to use column profiling.

So let's start by understanding what data profiling is. It is the process of examining the data available from an existing information source (SAP, database, file) and collecting statistics or informative summaries about that data. Use profiling to examine data so you can understand its content, structure, and data quality dependencies.

Column Profiling: Determines the values and characteristics of data elements, such as:
- Number of distinct values or patterns in a column
- Distribution of distinct words in a column

Address Profiling: Determines the quality of addresses.

Dependency Profiling: Identifies attribute-level relationships in the data by finding the values in one or more dependent columns that rely on the value in a primary column.

Redundancy Profiling: Determines the degree of overlapping data values or duplications between two sets of columns.

Uniqueness Profiling: Returns the percentage and counts of rows that contain unique data for the set of selected columns.

Content Type Profiling: Returns information about the type of data that exists in the columns of a table.

Above are the different types of profiling techniques available within SAP Information Steward. Uniqueness profiling is about finding duplicate values and checking uniqueness within the column or columns of the same table, whereas redundancy profiling is about finding overlap between a pair of columns from two different tables.

Now, to explain them in detail, I will start with column profiling. Consider the below data set as an example to explain the column profiling.

Here are some key points to remember when you are performing column profiling in SAP Information Steward:
- Column profiling, as the name implies, helps the user understand the data stored within the columns of tables/views.
- This profiling feature helps in determining values, string lengths, completeness and distribution across the columns.
- Column profiling can be performed on a table/view.

To perform the column profiling on a table/view:
- Select the view/table and click Column profiling in the profiling options in the workspace section of SAP Information Steward.
- The window shown in the screenshot will pop up.
- Check the boxes for the types of column profile to perform and leave the other values at their defaults.
- Hit the Save and Run Now button to execute the column profiling.

Input Sampling Rate: How you want the records chosen. For example, if you chose a Max input size of 1,000 records and you enter a rate of 1, then the first 1,000 records will be profiled. If you enter a rate of 2, then every second record of the total records in the table, up to 1,000 records, will be profiled, and so on.

Filter Condition: You can also add a filter condition while creating the task, using the filter condition option.

Distribution Option: If you need to check the distribution of data, or if you want to see how the data is spread across the primary columns, you can use the median and distribution and word distribution options. Note: this option should only be used for the columns for which you want to see distribution results, as it takes a lot of time to return results when you have a huge amount of data.

You can check the task status in the Task section of SAP Information Steward. Once the task is complete, the results can be viewed in the workspace section.

Reading the results generated from column profiling:

Column profiling gives us a statistical analysis of the data, which can be very useful in understanding the data that exists in the system. Below is the explanation of each result.

Value (analysis of the values populated in the column):
- Average: Average value for that column (all records)

String Length (analysis of the lengths of the values populated in the column)
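To make the column-level statistics discussed in this post more concrete, here is a minimal Python sketch of the kind of summary a column profile contains: completeness, distinct values, average value, string lengths and distribution. It is only an illustration and not how SAP Information Steward computes its results; the function name `profile_column`, the result keys, and the rule that `None` or an empty string counts as missing are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean


def profile_column(values):
    """Summarise one column of data (hypothetical helper, not an SAP API).

    Assumption: None and the empty string are treated as missing values.
    """
    present = [v for v in values if v is not None and v != ""]
    # String length analysis works on the text form of each populated value.
    lengths = [len(str(v)) for v in present]
    # Value analysis (average) only makes sense for numeric values.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return {
        "count": len(values),
        "completeness_pct": 100.0 * len(present) / len(values) if values else 0.0,
        "distinct_values": len(set(present)),
        "average_value": mean(numeric) if numeric else None,
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "average_length": mean(lengths) if lengths else None,
        "distribution": Counter(present),
    }


if __name__ == "__main__":
    # Made-up sample columns, purely for illustration.
    print(profile_column([30, 40, None, 50]))
    print(profile_column(["Pune", "Delhi", "Pune", ""]))
```

Running the sketch on a small sample column is a quick way to sanity-check what you expect to see before reading the profiling results in the tool.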

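The interaction between the Input Sampling Rate and the Max input size described in this post can be sketched in a few lines of Python. This is a rough model of the described behaviour, not SAP Information Steward code: the function name `sample_records` and its parameter names are hypothetical, and the sketch assumes that sampling starts at the first record.

```python
from itertools import islice


def sample_records(records, sampling_rate=1, max_input_size=1000):
    """Pick every `sampling_rate`-th record, up to `max_input_size` records.

    Hypothetical helper for illustration only.
    Assumption: sampling starts with the first record.
    """
    if sampling_rate < 1:
        raise ValueError("sampling_rate must be 1 or greater")
    every_nth = islice(records, 0, None, sampling_rate)  # step through the input
    return list(islice(every_nth, max_input_size))       # cap the sample size


if __name__ == "__main__":
    table = range(1, 5001)  # stand-in for 5,000 table records
    print(len(sample_records(table, sampling_rate=1)))  # rate 1: first 1,000 records
    print(sample_records(table, sampling_rate=2)[:5])   # rate 2: every second record
```

With a rate of 1 the sample is simply the first 1,000 records, while a rate of 2 spreads the same 1,000-record budget over the first 2,000 records of the table.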