{"id":4866,"date":"2020-04-19T19:48:45","date_gmt":"2020-04-19T19:48:45","guid":{"rendered":"https:\/\/www.testpreptraining.com\/tutorial\/?page_id=4866"},"modified":"2020-04-19T19:53:01","modified_gmt":"2020-04-19T19:53:01","slug":"ai-platform-pipelines-google-professional-data-engineer-gcp","status":"publish","type":"page","link":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/","title":{"rendered":"AI Platform Pipelines Google Professional Data Engineer GCP"},"content":{"rendered":"<ul>\n<li>AI Platform Pipelines makes it easier to get started with MLOps.<\/li>\n<li>It lets you easily set up Kubeflow Pipelines with TensorFlow Extended (TFX).<\/li>\n<li>Kubeflow Pipelines is an open source platform for running, monitoring, auditing, and managing ML pipelines on Kubernetes.<\/li>\n<li>TFX is an open source project for building ML pipelines that orchestrate end-to-end ML workflows.<\/li>\n<li>ML pipelines are portable, scalable ML workflows.<\/li>\n<li>Use ML pipelines to:\n<ul>\n<li>Apply MLOps strategies to automate repeatable processes.<\/li>\n<li>Experiment by running an ML workflow with different sets of hyperparameters.<\/li>\n<li>Reuse a pipeline&#8217;s workflow to train a new model.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><strong>Pipeline Components<\/strong><\/p>\n<ul>\n<li>are self-contained sets of code<\/li>\n<li>perform one step in a pipeline&#8217;s workflow, such as\n<ul>\n<li>data preprocessing<\/li>\n<li>data transformation<\/li>\n<li>model training<\/li>\n<\/ul>\n<\/li>\n<li>are composed of a\n<ul>\n<li>set of input parameters<\/li>\n<li>set of outputs<\/li>\n<li>location of a container image<\/li>\n<\/ul>\n<\/li>\n<li>A component&#8217;s container image includes the\n<ul>\n<li>component&#8217;s executable code<\/li>\n<li>definition of the environment that the code runs in.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-4681 aligncenter\" 
src=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image022-406x400.png\" alt=\"\" width=\"406\" height=\"400\" srcset=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image022-406x400.png 406w, https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image022.png 518w\" sizes=\"auto, (max-width: 406px) 100vw, 406px\" \/><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Understanding pipeline workflow<\/strong><\/p>\n<ul>\n<li>Each task in a pipeline performs a step in the pipeline&#8217;s workflow.<\/li>\n<li>Tasks are instances of pipeline components.<\/li>\n<li>Tasks have input parameters, outputs, and a container image.<\/li>\n<li>Task input parameters can be set from the pipeline&#8217;s input parameters or from the outputs of other tasks in the pipeline.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>For example, consider a pipeline with the following tasks:<\/p>\n<p>&nbsp;<\/p>\n<ul>\n<li><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-4682 aligncenter\" src=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image023.png\" alt=\"\" width=\"740\" height=\"330\" \/>Preprocess: prepares the training data.<\/li>\n<li>Train: uses the preprocessed training data to train the model.<\/li>\n<li>Predict: deploys the trained model as an ML service and gets predictions for the testing dataset.<\/li>\n<li>Confusion matrix: uses the output of the prediction task to build a confusion matrix.<\/li>\n<li>ROC: uses the output of the prediction task to perform receiver operating characteristic (ROC) curve analysis.<\/li>\n<\/ul>\n<p>The Kubeflow Pipelines SDK analyzes the task dependencies as follows:<\/p>\n<ul>\n<li>The preprocessing task does not depend on any other tasks, so it can run first.<\/li>\n<li>The training task relies on data produced by the preprocessing task, so training must occur after 
preprocessing.<\/li>\n<li>The prediction task relies on the trained model produced by the training task, so prediction must occur after training.<\/li>\n<li>Building the confusion matrix and performing ROC analysis both rely on the output of the prediction task, so they must occur after prediction is complete.<\/li>\n<li>Hence, the system runs the preprocessing, training, and prediction tasks sequentially, and then runs the confusion matrix and ROC tasks concurrently.<\/li>\n<li>With AI Platform Pipelines, you can orchestrate machine learning (ML) workflows as reusable and reproducible pipelines.<\/li>\n<\/ul>\n<p><strong>Building pipelines using the TFX SDK<\/strong><\/p>\n<ul>\n<li>TFX is an open source project for defining an ML workflow as a pipeline.<\/li>\n<li>TFX components can only train TensorFlow-based models.<\/li>\n<li>TFX provides components to\n<ul>\n<li>ingest and transform data<\/li>\n<li>train and evaluate a model<\/li>\n<li>deploy a trained model for inference, etc.<\/li>\n<\/ul>\n<\/li>\n<li>By using the TFX SDK, you can compose a pipeline for your ML process from TFX components.<\/li>\n<\/ul>\n<p><strong>Building pipelines using the Kubeflow Pipelines SDK<\/strong><\/p>\n<p>Build components and pipelines by:<\/p>\n<ul>\n<li>Developing the code for each step in your workflow using your preferred language and tools<\/li>\n<li>Creating a Docker container image for each step&#8217;s code<\/li>\n<li>Using Python to define your pipeline with the Kubeflow Pipelines SDK<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>makes it easier to get started with MLOps easily set up Kubeflow Pipelines with TensorFlow Extended (TFX). Kubeflow Pipelines is an open source platform for running, monitoring, auditing, and managing ML pipelines on Kubernetes. TFX is an open source project for building ML pipelines that orchestrate end-to-end ML workflows. 
ML pipelines are portable, scalable ML&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"categories":[617],"tags":[729,619,623,726,622,618,621],"class_list":["post-4866","page","type-page","status-publish","hentry","category-google-gcp","tag-ai-platform-pipelines","tag-data-engineer","tag-gcp","tag-google-ai-platform-formerly-cloud-ml-engine","tag-google-certification","tag-google-cloud","tag-professional-data-engineer"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials<\/title>\n<meta name=\"description\" content=\"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials\" \/>\n<meta property=\"og:description\" content=\"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/\" \/>\n<meta property=\"og:site_name\" content=\"Testprep Training Tutorials\" \/>\n<meta property=\"article:modified_time\" content=\"2020-04-19T19:53:01+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image022-406x400.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/\",\"name\":\"AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials\",\"isPartOf\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#website\"},\"datePublished\":\"2020-04-19T19:48:45+00:00\",\"dateModified\":\"2020-04-19T19:53:01+00:00\",\"description\":\"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines\",\"breadcrumb\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI Platform Pipelines Google Professional Data Engineer 
GCP\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#website\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\",\"name\":\"Testprep Training Tutorials\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.testpreptraining.ai\/tutorial\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#organization\",\"name\":\"Testprep Training\",\"url\":\"https:\/\/www.testpreptraining.ai\/tutorial\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png\",\"contentUrl\":\"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png\",\"width\":583,\"height\":153,\"caption\":\"Testprep Training\"},\"image\":{\"@id\":\"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials","description":"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/","og_locale":"en_US","og_type":"article","og_title":"AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials","og_description":"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines","og_url":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/","og_site_name":"Testprep Training Tutorials","article_modified_time":"2020-04-19T19:53:01+00:00","og_image":[{"url":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-content\/uploads\/2020\/04\/Professional-Data-Engineer-Google-Cloud-image022-406x400.png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/","url":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/","name":"AI Platform Pipelines Google Professional Data Engineer GCP - Testprep Training Tutorials","isPartOf":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#website"},"datePublished":"2020-04-19T19:48:45+00:00","dateModified":"2020-04-19T19:53:01+00:00","description":"Google Cloud Certified Professional Data Engineer Tutorial, dumps, brief notes on AI Platform Pipelines","breadcrumb":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/ai-platform-pipelines-google-professional-data-engineer-gcp\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.testpreptraining.ai\/tutorial\/"},{"@type":"ListItem","position":2,"name":"AI Platform Pipelines Google Professional Data Engineer GCP"}]},{"@type":"WebSite","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#website","url":"https:\/\/www.testpreptraining.ai\/tutorial\/","name":"Testprep Training Tutorials","description":"","publisher":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.testpreptraining.ai\/tutorial\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#organization","name":"Testprep Training","url":"https:\/\/www.testpreptraining.ai\/tutorial\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/","url":"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png","contentUrl":"https:\/\/www.testpreptraining.com\/tutorial\/wp-content\/uploads\/2020\/07\/tpt-logo-6.png","width":583,"height":153,"caption":"Testprep Training"},"image":{"@id":"https:\/\/www.testpreptraining.ai\/tutorial\/#\/schema\/logo\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/4866","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/comments?post=4866"}],"version-history":[{"count":3,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/4866\/revisions"}],"predecessor-version":[{"id":4887,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/pages\/4866\/revisions\/4887"}],"wp:attachment":[{"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/media?parent=4866"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/categories?post=4866"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.testpreptraining.ai\/tutorial\/wp-json\/wp\/v2\/tags?post=4866"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}