Objective
As a developer, you often work with datasets containing customer addresses which may not be of good quality. You need to ensure that addresses are correct for use cases ranging from customer ID verification, to delivery, and more.
The Address Validation API is a product from Google Maps Platform that you can use to validate an address. However, it only processes one address at a time. In this document, we will look into how to use the High Volume Address Validation under different scenarios, from API testing to one-time and recurring address validation.
Use cases
High Volume Address Validation is useful in the following use cases.
Testing
You often want to test the Address Validation API by running thousands of addresses through it. You might have the addresses in a comma-separated values (CSV) file and want to validate their quality.
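As a rough illustration of the testing scenario, the sketch below reads addresses from CSV text and turns each row into a request body for the API's validateAddress REST method. The column names (address, region_code) and the helper names are assumptions made for this example, not part of the API. The validate_address helper sends one address per call, since the API processes one address at a time.

```python
import csv
import io
import json
import urllib.request

# Public REST endpoint of the Address Validation API.
ENDPOINT = "https://addressvalidation.googleapis.com/v1:validateAddress"


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per address row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_request_body(row):
    """Turn one CSV row into a validateAddress request body.

    Assumption: the CSV has an "address" column and an optional
    "region_code" column. Adjust to your own file layout.
    """
    body = {"address": {"addressLines": [row["address"]]}}
    if row.get("region_code"):
        body["address"]["regionCode"] = row["region_code"]
    return body


def validate_address(body, api_key):
    """Send a single address to the API and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In a test run you would loop over read_rows(...), call validate_address for each body with your own API key, and respect the quota limits of your project.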
One-time validation of addresses
While onboarding to the Address Validation API, you want to validate your existing address database against the user database.
Recurring validation of addresses
A number of scenarios call for validating addresses on a recurring basis:
- You may have scheduled jobs that validate addresses captured during the day, for example, from customer signups, order details, or delivery schedules.
- You may receive data dumps containing addresses from different departments, for example, from sales to marketing. The new department receiving the addresses often wants to validate them before using.
- You might collect addresses during surveys or promotions and enter them into the online system later. You want to verify that the addresses are correct when entering them into the system.
Technical deep dive
For the purposes of this document, we assume that:
- You are calling the Address Validation API with addresses from a customer database (i.e. a database with customer details)
- You can cache validity flags against individual addresses in your database.
- Validity flags are retrieved from the Address Validation API when an individual customer logs in.
Cache for production use
When using the Address Validation API, you often want to cache some part of the response from the API call. Our Terms of Service limit what data can be cached, and any data that can be cached from the Address Validation API must be cached against a user account. This means that in the database, the address or address metadata must be cached against a user's email address or other primary ID.
For the High Volume Address Validation use case, data caching must follow the Address Validation API Service Specific Terms, outlined in Section 11.3. Under those terms, you can cache the following:
- Data from the AddressComponent object:
  - confirmationLevel
  - inferred
  - spellCorrected
  - replaced
  - unexpected
Based on this information, you will be able to determine whether a user's address may be invalid, in which case you will prompt the user for a corrected address during their next interaction with your application.
If you want to cache any information about the actual address, then that data must be cached only with the user's consent. This ensures that the user is well aware why a particular service is storing their address and they are OK with the terms of sharing their address.
An example of user consent would be direct interaction with an ecommerce address form on a checkout page. There is an understanding that you will cache and process the address for the purposes of shipping a package.
With the user's consent, you can cache formattedAddress and other key components from the response. However, in a headless scenario, a user cannot provide consent since the address validation is happening from the backend. Therefore, you can cache very limited information in this headless scenario.
Understand the response
If the Address Validation API response contains the following markers, then you can be confident the input address is of deliverable quality:
- The addressComplete marker in the Verdict object is true,
- The validationGranularity in the Verdict object is PREMISE or SUB_PREMISE,
- None of the AddressComponent are marked as:
  - inferred (Note: inferred=true can happen when addressComplete=true)
  - spellCorrected
  - replaced
  - unexpected, and
- confirmationLevel: The confirmation level on the AddressComponent is set to CONFIRMED or UNCONFIRMED_BUT_PLAUSIBLE
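The markers listed above can be checked programmatically. The following is a minimal sketch that assumes the response has already been parsed from JSON into a Python dict with the structure result.verdict and result.address.addressComponents. The function name is_deliverable is hypothetical. Boolean fields that are false are typically omitted from the JSON response, so the sketch treats a missing flag as false.

```python
DELIVERABLE_GRANULARITIES = {"PREMISE", "SUB_PREMISE"}
ACCEPTED_CONFIRMATION_LEVELS = {"CONFIRMED", "UNCONFIRMED_BUT_PLAUSIBLE"}
ISSUE_FLAGS = ("inferred", "spellCorrected", "replaced", "unexpected")


def is_deliverable(response):
    """Return True only if every marker listed above is satisfied."""
    result = response.get("result", {})
    verdict = result.get("verdict", {})
    components = result.get("address", {}).get("addressComponents", [])

    # Verdict-level checks: complete address at premise level or finer.
    if not verdict.get("addressComplete", False):
        return False
    if verdict.get("validationGranularity") not in DELIVERABLE_GRANULARITIES:
        return False

    # Component-level checks: no issue flag set, acceptable confirmation level.
    for component in components:
        if any(component.get(flag, False) for flag in ISSUE_FLAGS):
            return False
        if component.get("confirmationLevel") not in ACCEPTED_CONFIRMATION_LEVELS:
            return False
    return True
```

This check is deliberately strict: it rejects any address with an inferred component, even though inferred=true can occur together with addressComplete=true. You may want to relax that rule for your own use case.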
If the API response does not contain the above markers, then the input address was likely of poor quality, and you can cache flags in your database to reflect that. Cached flags indicate that the address as a whole is poor quality, while more detailed flags such as Spell Corrected indicate the specific type of address quality issue. On the next customer interaction with an address flagged as poor quality you can call the Address Validation API with the existing address. The Address Validation API will return the corrected address which you can display using a UI prompt. Once the customer accepts the formatted address you can cache the following from the response:
- formattedAddress
- postalAddress
- addressComponent componentNames, or
- UspsData standardizedAddress
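Once the customer has accepted the corrected address, the fields named above could be pulled out of the response as in the sketch below. It assumes the response is a parsed JSON dict. The function name and the keys of the returned dict are hypothetical and only illustrate the idea. Whether you may store these fields still depends on the consent you have obtained and on the applicable terms.

```python
def fields_to_cache_after_consent(response):
    """Collect the address fields that may be cached once the user has accepted.

    Assumption: `response` is the parsed JSON body of a validateAddress call.
    """
    result = response.get("result", {})
    address = result.get("address", {})
    cached = {
        "formattedAddress": address.get("formattedAddress"),
        "postalAddress": address.get("postalAddress"),
        "componentNames": [
            component.get("componentName")
            for component in address.get("addressComponents", [])
        ],
    }
    # USPS data is only present for some addresses, so add it only when returned.
    usps_address = result.get("uspsData", {}).get("standardizedAddress")
    if usps_address is not None:
        cached["uspsStandardizedAddress"] = usps_address
    return cached
```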
Implement headless address validation
Based on the discussion above:
- It is often necessary to cache some part of the response from the Address Validation API for business reasons.
- However, the Google Maps Platform Terms of Service restrict what data can be cached.
In the following section, we discuss a two-step process for conforming to the Terms of Service while implementing high volume address validation.
Step 1:
In the first step we will look into how to implement a high volume address validation script from an existing data pipeline. This process will allow you to store specific fields from the Address Validation API response in a Terms of Service compliant way.
Diagram A: The following diagram shows how a data pipeline can be enhanced with High Volume Address Validation logic.
According to the Terms of Service, you can cache the following data from the addressComponent:
- confirmationLevel
- inferred
- spellCorrected
- replaced
- unexpected
Thus, during this step of the implementation, we cache the fields listed above against the UserID.
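One possible way to implement this step is sketched below. It keeps only the five cacheable fields from each address component and stores them as JSON against a user ID in a SQLite table. The table name address_flags, its columns, and the helper names are assumptions made for this example. Your own pipeline would write to whatever store already holds your customer records.

```python
import json
import sqlite3

# The only AddressComponent fields this step caches.
CACHEABLE_FIELDS = (
    "confirmationLevel",
    "inferred",
    "spellCorrected",
    "replaced",
    "unexpected",
)


def open_flag_store(path=":memory:"):
    """Open (or create) a SQLite store for validation flags keyed by user ID."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS address_flags "
        "(user_id TEXT PRIMARY KEY, flags TEXT NOT NULL)"
    )
    return connection


def extract_cacheable_flags(response):
    """Keep only the cacheable fields of each component, dropping all address text."""
    components = (
        response.get("result", {}).get("address", {}).get("addressComponents", [])
    )
    return [
        {field: component[field] for field in CACHEABLE_FIELDS if field in component}
        for component in components
    ]


def cache_flags(connection, user_id, response):
    """Store the flags for one user, replacing any flags cached earlier."""
    flags = extract_cacheable_flags(response)
    connection.execute(
        "INSERT OR REPLACE INTO address_flags (user_id, flags) VALUES (?, ?)",
        (user_id, json.dumps(flags)),
    )
    connection.commit()
    return flags
```

Note that the component names and types are intentionally discarded here, so the stored record contains no part of the address itself.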
For more information see details on the actual data structure.
Step 2:
In step 1, we collected feedback that some addresses in the input dataset may not be of high quality. In the next step, we will take these flagged addresses and present them to the user and get their consent to correct the stored address.
Diagram B: This diagram shows what an end-to-end integration of the user consent flow could look like:
- When the user logs in, first check if you have cached any validation flags in your system.
- If there are flags, you should present the user with a UI to correct and update their address.
- You can call the Address Validation API again with the updated or cached address and present the corrected address to the user to confirm.
- If the address is of good quality, the Address Validation API returns a formattedAddress.
- You can either present that address to the user if corrections have been made, or silently accept if there are no corrections.
- Once the user accepts, you can cache the formattedAddress in the database.
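The decision points in this flow can be sketched as two small functions. This is an illustration under stated assumptions, not a definitive implementation: the function names and the returned action labels are hypothetical, cached_flags is assumed to be a list of per-component flag dicts cached in step 1, and comparing the entered text to formattedAddress is a simplistic stand-in for detecting whether corrections were made.

```python
ISSUE_FLAGS = ("inferred", "spellCorrected", "replaced", "unexpected")
ACCEPTED_CONFIRMATION_LEVELS = {"CONFIRMED", "UNCONFIRMED_BUT_PLAUSIBLE"}


def needs_address_review(cached_flags):
    """Return True when any cached component flag signals a quality issue."""
    for component in cached_flags:
        if any(component.get(flag, False) for flag in ISSUE_FLAGS):
            return True
        if component.get("confirmationLevel") not in ACCEPTED_CONFIRMATION_LEVELS:
            return True
    return False


def confirmation_step(entered_address, response):
    """Decide what to show the user after re-validating their address.

    Returns a (action, formatted_address) tuple. The action labels are
    hypothetical names for the UI branches described above.
    """
    formatted = response.get("result", {}).get("address", {}).get("formattedAddress")
    if formatted is None:
        # No formatted address came back, so ask the user to correct the input.
        return ("ask_again", None)
    if formatted == entered_address:
        # Nothing was corrected, so the address can be accepted silently.
        return ("accept_silently", formatted)
    # Corrections were made, so the user should confirm the new address.
    return ("ask_user_to_confirm", formatted)
```

On login, you would call needs_address_review with the cached flags for that user and only enter the confirmation flow when it returns True.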
Conclusion
High Volume Address Validation is a common use case you are likely to encounter in many applications. This document demonstrates some scenarios and a design pattern for implementing such a solution in a way that conforms to the Google Maps Platform Terms of Service.
We have further written a reference implementation of High Volume Address Validation as an open source library on GitHub. Check it out to get started building with High Volume Address Validation quickly. Also visit the article on design patterns of how to use the library in different scenarios.
Next Steps
Download the Improve checkout, delivery, and operations with reliable addresses Whitepaper and view the Improving checkout, delivery, and operations with Address Validation Webinar.
Suggested further reading:
- Applications of High Volume Address Validation
- Python library on GitHub
- Explore the demo of Address Validation
Contributors
Google maintains this article. The following contributors originally wrote it.
Principal authors:
Henrik Valve | Solutions Engineer
Thomas Anglaret | Solutions Engineer
Sarthak Ganguly | Solutions Engineer