The first step of the ETL process, extraction, collects data from multiple targeted sources. Extraction is the most complicated task in the ETL process, because many sources lack the required quality or quantity of data, and determining which data is eligible for extraction is not easy.

Extraction needs a lot of work during the research phase, because before doing anything you should understand your data correctly. It is also a continuous process: the data is normally extracted not just once but periodically, to supply all changed data to the warehouse and keep it up to date.

There are two types of extraction, logical extraction and physical extraction, and each has its own subtypes, so let us walk through them.

There are two kinds of logical extraction:

Full extraction: this is used when data is extracted and loaded for the first time. The data is extracted from the source completely, so the extracted data reflects all the data currently available on the source system. Full extraction is also used when the system can't identify which data has been updated. In that situation, we take a full copy of the source and compare it with the latest extraction to identify the changes.

Incremental extraction: this keeps track of the data updated in the source systems since the last successful extraction, so that only the new or changed parts are extracted and loaded, not the whole data set as in full extraction. Changes are tracked using a last-changed timestamp in the source systems, so the extraction tool must recognize new or updated data by the time it was added or updated. Incremental extraction can also rely on the source system sending an update notification whenever data is added or updated, describing what was changed or added.

There are two kinds of physical extraction:

Online extraction: data is extracted directly from the source systems. The extraction process connects directly to them and there is no need for any external file, which is why it is called online.

Offline extraction: data isn't extracted directly from the source systems. First it is copied to an external file, then the extraction process connects to that external file and starts processing. So when you want to start transforming your data, you can fetch records from the external file instead of accessing the source directly.

1.2 Transformation
After data extraction, the second stage of the ETL process is transformation, in which the data is transformed to meet the schema and requirements of the destination.
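As a rough illustration of the timestamp-based incremental extraction and the external staging file described above, here is a minimal Python sketch. It assumes a hypothetical source table named customers with a last_changed column, held in an in-memory SQLite database, and a CSV file as the external file. None of these names come from a specific ETL tool.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def extract_incremental(conn, last_extracted_at):
    """Return rows added or changed after the previous successful extraction."""
    cur = conn.execute(
        "SELECT id, name, last_changed FROM customers "
        "WHERE last_changed > ? ORDER BY last_changed",
        (last_extracted_at,),
    )
    return cur.fetchall()


def write_staging_file(rows, path):
    """Copy the extracted rows to an external file (offline extraction)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name", "last_changed"])
        writer.writerows(rows)


def read_staging_file(path):
    """Fetch records from the external file instead of the source system."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Hypothetical source system: an in-memory SQLite table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, last_changed TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ada", "2024-01-01T09:00:00"),
        (2, "Grace", "2024-01-03T12:30:00"),
        (3, "Linus", "2024-01-05T08:15:00"),
    ],
)

# Timestamp of the last successful extraction. ISO 8601 strings in the same
# format sort chronologically, so a plain string comparison is enough here.
last_extracted_at = "2024-01-02T00:00:00"

changed_rows = extract_incremental(conn, last_extracted_at)
staging_path = Path(tempfile.mkdtemp()) / "customers_staging.csv"
write_staging_file(changed_rows, staging_path)

# The transformation step would start from the staged records.
staged = read_staging_file(staging_path)
print([row["name"] for row in staged])

# Advance the timestamp only after the extraction has succeeded.
if changed_rows:
    last_extracted_at = max(row[2] for row in changed_rows)
```

Running the extraction again with the advanced timestamp returns no rows until the source changes. A real pipeline would usually store that timestamp somewhere durable between runs.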