Hi All!
A Data Lake can be defined as a storage repository that holds a vast amount of raw data in its native format until it is needed. My question is, what would be the point of storing all this data if you can’t access it easily? Azure Data Lake Store, Microsoft’s Platform as a Service (PaaS) implementation of a Data Lake, not only lets you store vast amounts of data but also lets you access that data through multiple channels.
The channel that interests me today is the WebHDFS REST APIs. More specifically, the topic of this blog is how to create an OAuth 2.0 application token so that 3rd party tools can authenticate via the WebHDFS REST APIs.
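To make the goal concrete, here is roughly what a WebHDFS call against a Data Lake Store looks like once a token exists. This is a minimal Python sketch, not an official client: the function name `build_liststatus_request`, the account name `mydatalake`, and the folder `/raw/sales` are placeholders I made up for illustration. It only builds the request, so nothing is sent over the network.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_liststatus_request(account: str, path: str, token: str) -> Request:
    """Build (but do not send) a WebHDFS LISTSTATUS request for a folder.

    `account` is the Data Lake Store account name and `token` is an
    OAuth 2.0 access token; both are supplied by the caller.
    """
    # WebHDFS endpoint of the store; the account name here is a placeholder.
    base = f"https://{account}.azuredatalakestore.net/webhdfs/v1/"
    url = base + quote(path.lstrip("/")) + "?" + urlencode({"op": "LISTSTATUS"})
    # The access token travels as a bearer token in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


if __name__ == "__main__":
    req = build_liststatus_request("mydatalake", "/raw/sales", "<access-token>")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call. The token in the header is what a 3rd party tool needs before it can talk to the store at all.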
OAuth 2.0 is an industry-standard protocol for authorization which, in the context of Azure Data Lake, allows a person or application to authenticate to the Data Lake Store and consume data. The following shows how to create an application within Azure Active Directory and configure the appropriate access permissions.
Prerequisites:
- Azure Data Lake Store resource created. Follow this guide in order to create a new one: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-get-started-portal
- Azure Active Directory. Follow this guide to get you started with Azure Active Directory: https://docs.microsoft.com/en-ca/azure/active-directory/develop/active-directory-howto-tenant
In order to create an OAuth 2.0 token, you will need to register an application within your Azure Active Directory. This can be done by accessing your Active Directory in the Azure Portal and performing the following steps:
- Creating a new App registration
- Creating a new registration
- Select an API
There you have it!
Once you’ve completed registration, Azure AD assigns your application a unique client identifier, the Application ID.
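With the Application ID in hand, a tool can ask Azure AD for a token using the OAuth 2.0 client credentials grant. The sketch below is a hedged Python illustration of that request, under a few assumptions that are not covered above: you know your tenant ID, you have generated a key (client secret) for the registered application, and the token endpoint and resource URI are the ones Microsoft documents for service-to-service authentication with Data Lake Store. The function names are mine, not part of any SDK.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Resource URI identifying Azure Data Lake Store in the token request
# (assumed from Microsoft's service-to-service authentication docs).
DATA_LAKE_RESOURCE = "https://datalake.azure.net/"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request.

    `client_id` is the Application ID assigned at registration;
    `tenant_id` and `client_secret` are assumed to be known already.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": DATA_LAKE_RESOURCE,
    }).encode("ascii")
    # Supplying a form-encoded body makes this a POST request.
    return Request(url, data=form, method="POST")


def fetch_access_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Send the request and return the access token from the JSON response."""
    with urlopen(build_token_request(tenant_id, client_id, client_secret)) as resp:
        return json.load(resp)["access_token"]
```

Calling `fetch_access_token("<tenant-id>", "<application-id>", "<client-secret>")` with real values should return the string that goes into the `Authorization: Bearer ...` header of WebHDFS calls. Keep the client secret out of source control; reading it from an environment variable is a common choice.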