Your data feeds let you make your restaurant, services, and menu available in Ordering End-to-End.
This document covers how to host your sandbox and production inventories and use batch ingestion to update your inventory in Ordering End-to-End.
Data feed environments
There are two data feed environments available for your integration development:
Feed environment | Description | Batch ingestion |
---|---|---|
Sandbox | The test environment for your feed development. | Required |
Production | The production environment for your inventory that you want to launch. | Required |
Hosting data feeds
For Ordering End-to-End to process your sandbox and production data feeds through batch ingestion, you must host your data feed files on Google Cloud Storage or Amazon S3, or serve them over HTTPS with a sitemap.
We recommend that you host the data feeds for your sandbox and production environments separately. This approach lets you do development and testing in your sandbox feed environment before you deploy the changes to production.
For example, if you use Google Cloud Storage as a hosting option, you would have the following paths:
- Sandbox feed: `gs://foorestaurant-google-feed-sandbox/`
- Production feed: `gs://foorestaurant-google-feed-prod/`
To host your inventory, do the following:
- Generate your data feed files.
- Choose a hosting solution.
- Host your data feeds.
- Ensure that your data feed files are updated regularly. Production data feeds must be updated daily.
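For example, a nightly job might regenerate the feed files, upload them to your bucket, and then touch the marker file last. The following is a minimal sketch using the `google-cloud-storage` Python client; the bucket name, feed directory, and file naming are placeholder assumptions based on the examples above, and your pipeline's details will differ.
```python
import datetime
import pathlib

from google.cloud import storage  # pip install google-cloud-storage

# Hypothetical names, matching the sandbox example path above.
BUCKET_NAME = "foorestaurant-google-feed-sandbox"
FEED_DIR = pathlib.Path("generated_feeds")  # where your generator wrote the files

client = storage.Client()
bucket = client.bucket(BUCKET_NAME)

# 1. Upload every generated feed file.
for path in FEED_DIR.glob("*.ndjson"):
    bucket.blob(path.name).upload_from_filename(str(path))

# 2. Only after all files are uploaded, update marker.txt with the
#    latest timestamp so batch ingestion knows the feed is ready.
timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
bucket.blob("marker.txt").upload_from_string(timestamp)
```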
For details on how to create an inventory feed, see the documentation for the Restaurant, Service, and Menu entities, as well as the Create a data feed section.
Guidelines on data feed files
Each file, which can contain multiple entities, must not exceed 200 MB. The top-level entities Restaurant, Service, and Menu, along with their child entities, must not exceed 4 MB altogether.
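As a sanity check before upload, you might verify those limits in your feed generator. The sketch below assumes NDJSON feed files with one top-level entity (plus its children) per line; adjust it to however your files are actually structured.
```python
import os
import sys

MAX_FILE_BYTES = 200 * 1024 * 1024   # 200 MB per feed file
MAX_ENTITY_BYTES = 4 * 1024 * 1024   # 4 MB per top-level entity + children

def check_feed_file(path: str) -> None:
    """Raise if a feed file or any single entity record exceeds the limits."""
    if os.path.getsize(path) > MAX_FILE_BYTES:
        raise ValueError(f"{path} exceeds the 200 MB file limit")
    with open(path, "rb") as f:
        for line_number, line in enumerate(f, start=1):
            # Assumes NDJSON: one top-level entity with its children per line.
            if len(line) > MAX_ENTITY_BYTES:
                raise ValueError(f"{path}:{line_number} exceeds the 4 MB entity limit")

if __name__ == "__main__":
    for feed_path in sys.argv[1:]:
        check_feed_file(feed_path)
```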
Choose a hosting solution
The following table lists the options for hosting your data feeds and how those hosts work with Ordering End-to-End:
 | Amazon S3 | Google Cloud Storage | HTTPS with a sitemap |
---|---|---|---|
Credentials and access | Provide Google with an IAM access key ID and secret access key with permission to read from your S3 resources. The S3 bucket needs to include a marker.txt file and your data feed files. | Provide Google with the paths to your production and sandbox bucket directories, and add the service account provided by your Google consultant as a reader of your Google Cloud Storage bucket. For more information on how to control access for Google Cloud Storage (GCS), see Google Cloud Platform Console: Setting bucket permissions. The GCS bucket needs to include a marker.txt file and your data feed files. | Provide Google with the URL of your sitemap.xml file and the username and password used to access your HTTPS server. |
How Google knows which files need to be fetched | Directory listing of all files in the bucket. | Directory listing of all files in the bucket. | Individual URLs of files listed in the sitemap. |
How Google knows that files are ready to fetch | After you finish generating your data feeds, update the marker.txt file with the latest timestamp. | After you finish generating your data feeds, update the marker.txt file with the latest timestamp. | After you finish generating your data feeds, update the last-modified response header of your sitemap.xml with the latest timestamp. |
File limits | Maximum number of files: 100,000. You must have fewer than 100,000 files total in your Amazon S3 bucket. | Maximum number of files: 100,000. You must have fewer than 100,000 files total in your Google Cloud Storage bucket. | Maximum number of files: 100,000. The number of file paths within your sitemap XML file must be less than 100,000. |

For example paths for each hosting option, see the Example paths section below.
Connect your data feeds for batch ingestion
After you host your feeds, you need to connect them to your project in the Actions Center. The initial configuration of the production feeds is done on the Onboarding Tasks page. After that, the production and sandbox feed configurations can be updated at any time from the Configuration > Feeds page by any portal user with an administrative role. The sandbox environment is used for development and testing, while the production feeds are shown to users.
If you host your data feeds with Amazon S3
- In the Actions Center, go to Configuration > Feeds.
- Click Edit and fill out the Update Feed form:
  - Feed delivery method: Set to Amazon S3.
  - Marker File: Provide the URL of the marker.txt file.
  - Data Files: Provide the URL to the S3 bucket that contains the data feeds.
  - Access ID: Enter the IAM access key ID with permissions to read from S3 resources.
  - Access Key: Enter the IAM secret access key with permissions to read from S3 resources.
- Click Submit.
- After one to two hours, check whether batch ingestion fetches your feed files.
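Before you submit the form, you can confirm from your side that the credentials you plan to enter can actually read the bucket. A quick probe with boto3, where the bucket name is the sandbox placeholder used on this page:
```python
import boto3  # pip install boto3

# Placeholder bucket name from the example paths on this page.
BUCKET = "foorestaurant-google-feed-sandbox"

s3 = boto3.client(
    "s3",
    aws_access_key_id="YOUR_ACCESS_KEY_ID",          # the Access ID you submit
    aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",  # the Access Key you submit
)

# Batch ingestion needs both: to list the bucket and to read marker.txt.
s3.head_object(Bucket=BUCKET, Key="marker.txt")
page = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=10)
print([obj["Key"] for obj in page.get("Contents", [])])
```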
If you host your data feeds with Google Cloud Storage
- In the Actions Center, go to Configuration > Feeds.
- Click Edit and fill out the Update Feed form:
  - Feed delivery method: Set to Google Cloud Storage.
  - Marker File: Provide the URL of the marker.txt file.
  - Data Files: Provide the URL to the GCS bucket that contains the data feeds.
- Click Submit.
- A service account is created to access your GCS bucket. The account name can be found in Configuration > Feeds after the onboarding tasks are complete. This service account needs the “Storage Legacy Object Reader” role, which can be granted to it on the IAM page of the Google Cloud console.
- After one to two hours, check whether batch ingestion fetches your feed files.
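The role can also be granted programmatically. A sketch using the `google-cloud-storage` client, where the bucket name and service account address are placeholders (the real account name comes from the Configuration > Feeds page):
```python
from google.cloud import storage  # pip install google-cloud-storage

BUCKET_NAME = "foorestaurant-google-feed-sandbox"  # placeholder
# Placeholder; use the account shown in Configuration > Feeds.
SERVICE_ACCOUNT = "feeds-reader@example.iam.gserviceaccount.com"

client = storage.Client()
bucket = client.bucket(BUCKET_NAME)

# Grant the "Storage Legacy Object Reader" role to the ingestion account.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.legacyObjectReader",
    "members": {f"serviceAccount:{SERVICE_ACCOUNT}"},
})
bucket.set_iam_policy(policy)
```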
If you host your data feeds with HTTPS
- In the Actions Center, go to Configuration > Feeds.
- Click Edit and fill out the Update Feed form:
  - Feed delivery method: Set to HTTPS.
  - Sitemap File: Provide the URL of the sitemap.xml file.
  - Username: Enter the username used to access the HTTPS server.
  - Password: Enter the password used to access the HTTPS server.
- Click Submit.
- After one to two hours, check whether batch ingestion fetches your feed files.
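You can check that the server answers an authenticated request and exposes the last-modified header that batch ingestion relies on. A short sketch with the `requests` library; the URL and credentials are placeholders from this page:
```python
import requests  # pip install requests

SITEMAP_URL = "https://sandbox-foorestaurant.com/sitemap.xml"  # placeholder

# Use GET instead of HEAD if your server doesn't answer HEAD requests.
resp = requests.head(SITEMAP_URL, auth=("feed_user", "feed_password"))
resp.raise_for_status()

# Batch ingestion uses this header to decide whether the feed changed.
print("last-modified:", resp.headers.get("last-modified"))
```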
Example paths
The following table contains example paths for each of the hosting options:
 | Amazon S3 | Google Cloud Storage | HTTPS with a sitemap |
---|---|---|---|
Path | s3://foorestaurant-google-feed-sandbox/ | gs://foorestaurant-google-feed-sandbox/ | https://sandbox-foorestaurant.com/sitemap.xml |
Marker file | s3://foorestaurant-google-feed-sandbox/marker.txt | gs://foorestaurant-google-feed-sandbox/marker.txt | Not applicable |
Sitemaps for HTTPS hosting
Use the following guidelines when you define sitemaps:
- Links in your sitemap must point to the files themselves.
- If your sitemap includes references to a cloud provider instead of your own domain name, ensure that the start of the URLs, like https://www.yourcloudprovider.com/your_id, is stable and unique to your batch job.
- Be careful not to upload partial sitemaps (such as in the event of a partial data upload). Doing so results in Google ingesting only the files in the sitemap, which causes your inventory levels to drop and might result in your feed ingestion being blocked.
- Ensure that the paths to the files referenced in the sitemap don't change. For example, don't have your sitemap reference https://www.yourcloudprovider.com/your_id/10000.json today but then reference https://www.yourcloudprovider.com/your_id/20000.json tomorrow.
Example sitemap
Here's an example sitemap.xml file that serves data feed files:
Example 1: Entities grouped by merchants (Recommended).
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://your_fulfillment_url.com/restaurant_1.ndjson</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
  <url>
    <loc>https://your_fulfillment_url.com/restaurant_2.ndjson</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
  <url>
    <loc>https://your_fulfillment_url.com/restaurant_3.ndjson</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
</urlset>
```
Example 2: Entities grouped by types.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://your_fulfillment_url.com/restaurant.json</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
  <url>
    <loc>https://your_fulfillment_url.com/menu.json</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
  <url>
    <loc>https://your_fulfillment_url.com/service.json</loc>
    <lastmod>2018-06-11T10:46:43+05:30</lastmod>
  </url>
</urlset>
```
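If you generate the sitemap as part of your feed job, write it in one shot so a partial sitemap is never published, and keep the file paths stable, per the guidelines above. A minimal generator sketch using Python's standard library; the feed URLs are the placeholders from Example 1:
```python
import datetime
import xml.etree.ElementTree as ET

# Placeholder feed files, as in Example 1 above; these paths must stay stable.
FEED_URLS = [
    "https://your_fulfillment_url.com/restaurant_1.ndjson",
    "https://your_fulfillment_url.com/restaurant_2.ndjson",
    "https://your_fulfillment_url.com/restaurant_3.ndjson",
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
lastmod = datetime.datetime.now(datetime.timezone.utc).isoformat()
for feed_url in FEED_URLS:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = feed_url
    ET.SubElement(url, "lastmod").text = lastmod

# Write the complete sitemap atomically from the full URL list.
ET.ElementTree(urlset).write("sitemap.xml", encoding="UTF-8", xml_declaration=True)
```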
Update your data feeds
After your data feeds are connected, Google checks for updates once every hour, but we only ingest all data feeds when the marker.txt or sitemap.xml files have been modified. We expect you to update your data feeds once a day to prevent stale inventory.

To indicate that the data feeds have been modified and are ready for batch ingestion, update the last-modified object metadata field of the marker.txt file (for GCS and S3) or the last-modified response header of your sitemap.xml file. Google uses these values to determine how fresh a data feed is.
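For example, an S3-hosted feed job might finish by rewriting the marker file, since re-uploading the object refreshes its Last-Modified metadata automatically. A minimal boto3 sketch, with a placeholder production bucket name:
```python
import datetime

import boto3  # pip install boto3

BUCKET = "foorestaurant-google-feed-prod"  # placeholder production bucket

# Re-uploading marker.txt refreshes the object's Last-Modified value,
# which signals to Google that a new feed generation is complete.
timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
boto3.client("s3").put_object(
    Bucket=BUCKET, Key="marker.txt", Body=timestamp.encode("utf-8"))
```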
While the batch feed is being ingested, the following rules apply (see the sketch after this list):
- New entities that don't exist in your current Ordering End-to-End inventory and don't have any errors are inserted.
- Entities already present in the inventory are updated if they have no ingestion errors and either their dateModified is more recent than the current entry's, or, if they don't have a dateModified, the feed ingestion start time is more recent than the current entry. Otherwise, they are marked as stale.
- Entities that were part of a previous feed but are no longer included in the batch feed being processed are deleted, provided there are no file-level errors in the feed.
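The insert-or-update rule can be restated as a small decision function. This is an illustrative restatement of the documented behavior, not official logic; the parameter names are assumptions for the sketch:
```python
from datetime import datetime
from typing import Optional

def entity_action(
    in_current_inventory: bool,
    has_errors: bool,
    date_modified: Optional[datetime],  # the entity's dateModified, if present
    current_entry_time: datetime,       # timestamp of the stored entry
    ingestion_start: datetime,          # when this batch ingestion began
) -> str:
    """Illustrative restatement of the documented ingestion rules."""
    if has_errors:
        return "skip"                   # entities with errors are not applied
    if not in_current_inventory:
        return "insert"                 # new, error-free entities are inserted
    # Fall back to the ingestion start time when dateModified is absent.
    effective = date_modified if date_modified is not None else ingestion_start
    return "update" if effective > current_entry_time else "mark stale"
```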
The timestamp or the last-modified response header must be updated only after all of the data feed files are generated and updated. Limit the batch jobs that update your data feeds to run only once a day, or leave a gap of at least three hours between each batch job. If you don't take these steps, Google might fetch stale files.