{"id":18527,"date":"2020-08-28T12:51:22","date_gmt":"2020-08-28T12:51:22","guid":{"rendered":"https:\/\/www.testpreptraining.com\/tutorial\/?page_id=18527"},"modified":"2020-08-28T12:53:47","modified_gmt":"2020-08-28T12:53:47","slug":"implementing-a-solution-using-data-lake-storage-gen2","status":"publish","type":"page","link":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/","title":{"rendered":"Implementing a solution using Data Lake Storage Gen2"},"content":{"rendered":"\n<p><a href=\"https:\/\/www.testpreptraining.ai\/tutorial\/exam-dp-200-implementing-an-azure-data-solution\/\" target=\"_blank\" rel=\"noreferrer noopener\">Go back to DP-200 Tutorials<\/a><\/p>\n\n\n\n<p>In this tutorial, we will learn how to load data into Azure Data Lake Storage Gen2 with Azure Data Factory.<\/p>\n\n\n\n<p>Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built into Azure Blob storage. It lets you interface with your data using both file system and object storage paradigms.<\/p>\n\n\n\n<p>Azure Data Factory (ADF) is a fully managed cloud-based data integration service. You can use it to populate the lake with data from a rich set of on-premises and cloud-based data stores and save time when building your analytics solutions.<\/p>\n\n\n\n<p>Azure Data Factory also provides a scale-out, managed data movement solution. 
Because of its scale-out architecture, ADF can ingest data at high throughput.<\/p>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Prerequisites<\/strong><\/h6>\n\n\n\n<ul class=\"wp-block-list\"><li>An Azure subscription.<\/li><li>An Azure Storage account with Data Lake Storage Gen2 enabled.<\/li><li>An AWS account with an S3 bucket that contains data.<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Creating a data factory<\/strong><\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Firstly, on the left menu, select Create a resource &gt; Data + Analytics &gt; Data Factory.<\/li><li>Secondly, in the New data factory page, provide values for the following fields:<\/li><\/ul>\n\n\n\n<ol class=\"wp-block-list\"><li>Name: enter a globally unique name for your Azure data factory. If you receive the error &#8220;Data factory name YourDataFactoryName is not available&#8221;, enter a different name for the data factory.<\/li><li>Subscription: select the Azure subscription in which to create the data factory.<\/li><li>Resource Group: select an existing resource group from the drop-down list, or select the Create new option and enter the name of a resource group.<\/li><li>Version: select V2.<\/li><li>Location: select the location for the data factory.<\/li><\/ol>\n\n\n\n<ul class=\"wp-block-list\"><li>Thirdly, select Create.<\/li><li>After creation is complete, go to your data factory. 
There you see the Data Factory home page as shown in the following image:<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-docs-1.png\" alt=\"creating Data lake storage\" class=\"wp-image-18537\" width=\"768\" height=\"516\" srcset=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-docs-1.png 884w, https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-docs-1-595x400.png 595w\" sizes=\"auto, (max-width: 768px) 100vw, 768px\" \/><figcaption><strong>Image Source: Microsoft<\/strong><\/figcaption><\/figure><\/div>\n\n\n\n<ul class=\"wp-block-list\"><li>Lastly, select the Author &amp; Monitor tile to launch the Data Integration Application in a separate tab.<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><a href=\"https:\/\/www.testpreptraining.ai\/implementing-an-azure-data-solution-dp-200-free-practice-test\" target=\"_blank\" rel=\"noopener noreferrer\"><img loading=\"lazy\" decoding=\"async\" width=\"961\" height=\"150\" src=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-pracice-tests-1.png\" alt=\"DP-200 practice tests\" class=\"wp-image-18535\" srcset=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-pracice-tests-1.png 961w, https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-pracice-tests-1-750x117.png 750w\" sizes=\"auto, (max-width: 961px) 100vw, 961px\" \/><\/a><\/figure><\/div>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Loading data into Azure Data Lake Storage Gen2<\/strong><\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Firstly, in the <strong>Get started page<\/strong>, select the <strong>Copy Data<\/strong> tile to launch the <strong>Copy Data 
tool.<\/strong><\/li><li>Secondly, in the <strong>Properties<\/strong> page, specify <strong>CopyFromAmazonS3ToADLS<\/strong> for the <strong>Task name<\/strong> field, and select <strong>Next<\/strong>.<\/li><li>Thirdly, in the <strong>Source data store<\/strong> page, click <strong>+ Create new connection<\/strong>. Then, select <strong>Amazon S3<\/strong> from the connector gallery, and select <strong>Continue<\/strong>.<\/li><li>Next, in the <strong>New linked service (Amazon S3)<\/strong> page, do the following steps:<\/li><\/ul>\n\n\n\n<ol class=\"wp-block-list\"><li>Firstly, specify the <strong>Access Key ID<\/strong> value.<\/li><li>Then, specify the <strong>Secret Access Key<\/strong> value.<\/li><li>Lastly, click <strong>Test connection<\/strong> to validate the settings, then select <strong>Create<\/strong>. A new AmazonS3 connection is created. Select <strong>Next<\/strong>.<\/li><\/ol>\n\n\n\n<ul class=\"wp-block-list\"><li>Then, in the <strong>Choose the input file or folder<\/strong> page, browse to the folder and file that you want to copy over. Select the folder\/file, and then select <strong>Choose<\/strong>.<\/li><li>After that, specify the copy behavior by checking the <strong>Recursively<\/strong> and <strong>Binary copy<\/strong> options. Select <strong>Next<\/strong>.<\/li><li>Then, in the <strong>Destination data store<\/strong> page, click <strong>+ Create new connection<\/strong>. Select <strong>Azure Data Lake Storage Gen2<\/strong>, and then select <strong>Continue<\/strong>.<\/li><li>Next, in the <strong>New linked service (Azure Data Lake Storage Gen2)<\/strong> page, do the following steps:<\/li><\/ul>\n\n\n\n<ol class=\"wp-block-list\"><li>Firstly, select your Data Lake Storage Gen2-capable account from the &#8220;Storage account name&#8221; drop-down list.<\/li><li>Then, select Create to create the connection. 
Then select Next.<\/li><\/ol>\n\n\n\n<ul class=\"wp-block-list\"><li>After that, in the Choose the output file or folder page, enter copyfroms3 as the output folder name, and select Next. ADF creates the corresponding ADLS Gen2 file system and subfolders during the copy if they don&#8217;t exist.<\/li><li>Then, in the Settings page, select Next to use the default settings.<\/li><li>Now, in the Summary page, review the settings, and select Next.<\/li><li>Next, on the Deployment page, select Monitor to monitor the pipeline (task).<\/li><li>When the pipeline run completes successfully, you see a pipeline run that is triggered by a manual trigger. You can use the links under the PIPELINE NAME column to view activity details and to rerun the pipeline.<\/li><li>To see the activity runs associated with the pipeline run, select the CopyFromAmazonS3ToADLS link under the PIPELINE NAME column. For details about the copy operation, select the Details link (eyeglasses icon) under the ACTIVITY NAME column. You can monitor details such as the volume of data copied from the source to the sink, data throughput, execution steps with corresponding duration, and the configuration used.<\/li><li>To refresh the view, select Refresh. 
Then select All pipeline runs at the top to go back to the Pipeline Runs view.<\/li><li>Lastly, verify that the data is copied into your Data Lake Storage Gen2 account.<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><a href=\"https:\/\/www.testpreptraining.ai\/implementing-an-azure-data-solution-dp-200-practice-exam\" target=\"_blank\" rel=\"noopener noreferrer\"><img loading=\"lazy\" decoding=\"async\" width=\"961\" height=\"150\" src=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-online-course-1.png\" alt=\"DP-200 Online course\" class=\"wp-image-18534\" srcset=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-online-course-1.png 961w, https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-online-course-1-750x117.png 750w\" sizes=\"auto, (max-width: 961px) 100vw, 961px\" \/><\/a><\/figure><\/div>\n\n\n\n<p class=\"has-text-align-right\"><strong>Reference: <\/strong><a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/data-factory\/load-azure-data-lake-storage-gen2\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Documentation<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/www.testpreptraining.ai\/tutorial\/exam-dp-200-implementing-an-azure-data-solution\/\" target=\"_blank\" rel=\"noreferrer noopener\">Go back to DP-200 Tutorials<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Go back to DP-200 Tutorials In this tutorial, we will learn and understand about loading data into Azure Data Lake Storage Gen2 with Azure Data Factory. Azure Data Lake Storage Gen2 refers to a set of capabilities dedicated to big data analytics, built into Azure Blob storage. 
It provides access to interface with your data&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"categories":[],"tags":[],"class_list":["post-18527","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200<\/title>\n<meta name=\"description\" content=\"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200\" \/>\n<meta property=\"og:description\" content=\"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/\" \/>\n<meta property=\"og:site_name\" content=\"Testprep Training Tutorials\" \/>\n<meta property=\"article:modified_time\" content=\"2020-08-28T12:53:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-docs-1.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/\",\"name\":\"Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200\",\"isPartOf\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#website\"},\"datePublished\":\"2020-08-28T12:51:22+00:00\",\"dateModified\":\"2020-08-28T12:53:47+00:00\",\"description\":\"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!\",\"breadcrumb\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Implementing a solution using Data Lake Storage Gen2\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#website\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\",\"name\":\"Testprep Training 
Tutorials\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.testpreptraining.ai\/tutorial\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#organization\",\"name\":\"Testprep Training\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png\",\"contentUrl\":\"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png\",\"width\":583,\"height\":153,\"caption\":\"Testprep Training\"},\"image\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200","description":"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/","og_locale":"en_US","og_type":"article","og_title":"Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200","og_description":"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!","og_url":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/","og_site_name":"Testprep Training Tutorials","article_modified_time":"2020-08-28T12:53:47+00:00","og_image":[{"url":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/08\/dp-200-docs-1.png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/","url":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/","name":"Implementing a solution using Data Lake Storage Gen2 | Microsoft DP-200","isPartOf":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#website"},"datePublished":"2020-08-28T12:51:22+00:00","dateModified":"2020-08-28T12:53:47+00:00","description":"Enhance your skills by learning about Implementing a solution using Data Lake Storage Gen2 using Microsoft DP-200 online course Now!","breadcrumb":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/implementing-a-solution-using-data-lake-storage-gen2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.testpreptraining.ai\/tutorial\/"},{"@type":"ListItem","position":2,"name":"Implementing a solution using Data Lake Storage Gen2"}]},{"@type":"WebSite","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#website","url":"https:\/\/www.testpreptraining.ai\/tutorial\/","name":"Testprep Training Tutorials","description":"","publisher":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.testpreptraining.ai\/tutorial\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#organization","name":"Testprep 
Training","url":"https:\/\/www.testpreptraining.ai\/tutorial\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/","url":"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png","contentUrl":"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png","width":583,"height":153,"caption":"Testprep Training"},"image":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/18527","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/comments?post=18527"}],"version-history":[{"count":2,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/18527\/revisions"}],"predecessor-version":[{"id":18545,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/18527\/revisions\/18545"}],"wp:attachment":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/media?parent=18527"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/categories?post=18527"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/tags?post=18527"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}