Before you begin

This document describes the prerequisites, best practices, and common errors when working with datasets.

Prerequisites

When creating a dataset:

  • Display names must be unique within your Google Cloud project.
  • Display names must be less than 64 bytes. Because display names are encoded in UTF-8, a single character can take multiple bytes in some languages.
  • Descriptions must be less than 1000 bytes.
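
Because the limits are in bytes rather than characters, it can help to measure the UTF-8 encoded length before creating a dataset. The following is a minimal sketch; the function names utf8_len and check_dataset_metadata are hypothetical helpers, not part of any Maps Datasets API client.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def check_dataset_metadata(display_name: str, description: str = "") -> list[str]:
    """Return a list of problems; an empty list means both byte limits are met."""
    problems = []
    # Limits taken from the prerequisites above: strictly less than 64 and 1000 bytes.
    if utf8_len(display_name) >= 64:
        problems.append("display name must be less than 64 bytes")
    if utf8_len(description) >= 1000:
        problems.append("description must be less than 1000 bytes")
    return problems
```

For example, a display name of 22 kanji characters takes 66 bytes in UTF-8, so it fails the check even though it has fewer than 64 characters.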

When uploading data:

  • The supported file types are CSV, GeoJSON, and KML.
  • The maximum supported file size is 500 MB.
  • Attribute column names cannot begin with the string "?_".
  • Three-dimensional geometries are not supported. This includes the "Z" suffix in the WKT format, and the altitude coordinate in the GeoJSON format.
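
The first three upload rules can be checked locally before sending a file. The sketch below is illustrative only: check_upload is a hypothetical helper, the extension list is an assumption about how the supported file types are usually named, and it assumes 1 MB means 1,000,000 bytes, which this document does not specify.

```python
import os

# Assumption: typical extensions for the supported CSV, GeoJSON, and KML types.
SUPPORTED_EXTENSIONS = {".csv", ".geojson", ".kml"}
# Assumption: 1 MB = 1,000,000 bytes.
MAX_UPLOAD_BYTES = 500 * 1000 * 1000


def check_upload(filename: str, size_bytes: int, column_names: list[str]) -> list[str]:
    """Return a list of problems found in the file name, size, and column names."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 500 MB")
    for name in column_names:
        # Attribute column names cannot begin with the string "?_".
        if name.startswith("?_"):
            problems.append(f"attribute column name begins with '?_': {name}")
    return problems
```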

Data preparation best practices

If your source data is complex or large, for example dense points or long linestrings or polygons, consider simplifying it before uploading to get the best performance in a visual map. Source files larger than 50 MB often fall into this category.

Here are some best practices for preparing your data:

  1. Minimize feature properties. Keep only the feature properties needed to style your map, for example "id" and "category". You can join additional properties to a feature in a client application using data-driven styles on a unique identifier key. For an example, see "See your data in real time with Data-driven styling".
  2. Use simple data types for property objects where possible, such as integers, to minimize tile size and improve map performance.
  3. Simplify complex geometries prior to uploading a file. You can do this in a geospatial tool of your choice, such as the open source Mapshaper.org utility, or in BigQuery using ST_Simplify on complex polygon geometries.
  4. Cluster very dense points prior to uploading a file. You can do this in a geospatial tool of your choice, such as the open source turf.js cluster functions, or in BigQuery using ST_CLUSTERDBSCAN on dense point geometries.
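
As an illustration of the first practice, the sketch below trims a GeoJSON feature collection down to the properties needed for styling and serializes it without extra whitespace. The helper names keep_properties and to_compact_json are hypothetical, and the sketch assumes the input is an already parsed GeoJSON FeatureCollection.

```python
import json


def keep_properties(feature_collection: dict, keep: set) -> dict:
    """Return a copy of a FeatureCollection whose features keep only the named properties."""
    slim_features = []
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        slim = dict(feature)  # shallow copy, so the input feature is left unchanged
        slim["properties"] = {k: v for k, v in properties.items() if k in keep}
        slim_features.append(slim)
    result = dict(feature_collection)
    result["features"] = slim_features
    return result


def to_compact_json(obj: dict) -> str:
    """Serialize without spaces after separators to keep the file small."""
    return json.dumps(obj, separators=(",", ":"))
```

Calling keep_properties(data, {"id", "category"}) keeps the unique identifier that a client application can later use as the join key for any properties that were dropped.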

For additional guidance on dataset best practices, see Visualize your data with Datasets and BigQuery.

GeoJSON requirements

Maps Datasets API supports the current GeoJSON specification. It supports GeoJSON files that contain any of the following object types:

  • Geometry objects. A geometry object is a spatial shape, described as a union of points, lines, and polygons with optional holes.
  • Feature objects. A feature object contains a geometry plus additional name/value pairs, whose meaning is application-specific.
  • Feature collections. A feature collection is a set of feature objects.

Maps Datasets API does not support GeoJSON files that have data in a coordinate reference system (CRS) other than WGS84.

For more information on GeoJSON, see RFC 7946.
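
A quick local check can tell which of the three object types a parsed file contains and whether it carries a "crs" member. This is a minimal sketch with hypothetical helper names. The geometry type names are those defined in RFC 7946. Treating a "crs" member as a warning sign is an assumption based on older GeoJSON files, which could use it to declare a non-WGS84 coordinate reference system; its presence does not prove the data is in another CRS.

```python
# Geometry type names defined in RFC 7946.
GEOMETRY_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def classify_geojson(obj: dict) -> str:
    """Return 'geometry', 'feature', 'feature collection', or 'unknown'."""
    kind = obj.get("type")
    if kind == "Feature":
        return "feature"
    if kind == "FeatureCollection":
        return "feature collection"
    if isinstance(kind, str) and kind in GEOMETRY_TYPES:
        return "geometry"
    return "unknown"


def has_crs_member(obj: dict) -> bool:
    """Return True if the top-level object declares a 'crs' member worth reviewing."""
    return "crs" in obj
```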

KML requirements

Maps Datasets API has the following requirements:

  • All URLs must be local (or relative) to the file itself.
  • Point, line, and polygon geometries are supported.
  • All data attributes are considered strings.

The following KML features are not supported:

  • Icons or <styleUrl> defined outside of the file.
  • Network links, such as <NetworkLink>.
  • Ground overlays, such as <GroundOverlay>.
  • 3D geometries or any altitude-related tags such as <altitudeMode>.
  • Camera specifications such as <LookAt>.
  • Styles defined inside the KML file.
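
Some of the unsupported features can be detected by scanning a KML file for the tags named in the list above. The following sketch uses Python's standard xml.etree.ElementTree module. The function name is hypothetical, and the tag set covers only the four tags this document names explicitly; it does not detect external icons, external style URLs, or inline styles.

```python
import xml.etree.ElementTree as ET

# Only the tags named explicitly in the list of unsupported features.
UNSUPPORTED_TAGS = {"NetworkLink", "GroundOverlay", "altitudeMode", "LookAt"}


def find_unsupported_kml_tags(kml_text: str) -> set:
    """Return the unsupported tag names that occur anywhere in the KML document."""
    root = ET.fromstring(kml_text)
    found = set()
    for element in root.iter():
        # ElementTree reports namespaced tags as "{namespace}name"; keep the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in UNSUPPORTED_TAGS:
            found.add(local_name)
    return found
```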

CSV requirements

For CSV files, the supported column names are listed below in order of priority:

  • latitude, longitude
  • lat, long
  • x, y
  • wkt (Well-Known Text)
  • address, city, state, zip
  • address (a single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043)

For example, suppose your file contains columns named x, y, and wkt. Because x and y appear earlier in the list above, they have higher priority: the values in the x and y columns are used and the wkt column is ignored.

In addition:

  • Each column name must belong to a single column. That is, you cannot have a column named xy that contains both x and y coordinate data. The x and y coordinates must be in separate columns.
  • Column names are case-insensitive.
  • The order of the column names does not matter. For example, if your CSV file contains lat and long columns, they can occur in any order.
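
The priority and matching rules above can be sketched as a small lookup. This is an illustration of the documented rules, not the service's actual implementation; pick_geometry_columns is a hypothetical helper, and it assumes a column group only counts when every column in the group is present.

```python
from typing import Optional

# Supported column groups, highest priority first, as listed above.
GEOMETRY_COLUMN_PRIORITY = [
    ("latitude", "longitude"),
    ("lat", "long"),
    ("x", "y"),
    ("wkt",),
    ("address", "city", "state", "zip"),
    ("address",),
]


def pick_geometry_columns(header: list) -> Optional[tuple]:
    """Return the highest-priority column group fully present in the header, or None."""
    # Column names are case-insensitive, and their order in the file does not matter.
    present = {name.lower() for name in header}
    for group in GEOMETRY_COLUMN_PRIORITY:
        if all(name in present for name in group):
            return group
    return None
```

With a header of WKT, Y, name, X, the function returns the x and y group, matching the example above in which the wkt column is ignored.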

Handle data upload errors

When uploading data to a dataset, you might experience one of the common errors described in this section.

GeoJSON errors

Common GeoJSON errors include:

  • Missing type field, or the type is not a string. The uploaded GeoJSON data file must contain a string field named type as part of each Feature object and Geometry object definition.
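
To locate the offending objects before uploading, you can walk the parsed file and report every Feature or Geometry object whose type is missing or is not a string. The sketch below is one way to do that; find_bad_types and the path notation it returns are hypothetical, and the walk only follows the features, geometry, and geometries members.

```python
def find_bad_types(obj, path: str = "$") -> list:
    """Return paths of objects whose 'type' member is missing or is not a string."""
    if not isinstance(obj, dict):
        return [path]  # not a JSON object, so it cannot carry a type field
    problems = []
    if not isinstance(obj.get("type"), str):
        problems.append(path)
    children = []
    if isinstance(obj.get("features"), list):
        children += [(f"{path}.features[{i}]", f) for i, f in enumerate(obj["features"])]
    if isinstance(obj.get("geometries"), list):
        children += [(f"{path}.geometries[{i}]", g) for i, g in enumerate(obj["geometries"])]
    # A null geometry is allowed on a Feature, so only descend when one is present.
    if obj.get("geometry") is not None:
        children.append((f"{path}.geometry", obj["geometry"]))
    for child_path, child in children:
        problems += find_bad_types(child, child_path)
    return problems
```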

KML errors

Common KML errors include:

  • The data file contains unsupported KML features. The data file must not contain any of the unsupported KML features listed in KML requirements; otherwise, the data import might fail.

CSV errors

Common CSV errors include:

  • Some rows are missing values for a geometry column. All rows in a CSV file must contain non-empty values for the geometry columns. The geometry columns include:
    • latitude, longitude
    • lat, long
    • x, y
    • wkt
    • address, city, state, zip
    • address (a single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043)
  • The x and y columns use the wrong units. If x and y are your geometry columns, ensure that the values are longitude and latitude. Some public datasets use different coordinate systems under the headers x and y. If the wrong units are used, the dataset might import successfully, but the points can render in unexpected locations.
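
Both CSV errors can be caught with a local scan before uploading. The following sketch uses Python's standard csv module; find_bad_rows is a hypothetical helper. It assumes that the header uses exactly the column names passed in, that x holds longitude and y holds latitude, and that both are in degrees. A value inside the valid range does not prove the units are right, but a value outside it is a strong sign of a different coordinate system.

```python
import csv
import io


def find_bad_rows(csv_text: str, x_column: str = "x", y_column: str = "y") -> list:
    """Return (row_number, problem) pairs; row 1 is the first data row after the header."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        x_text = (row.get(x_column) or "").strip()
        y_text = (row.get(y_column) or "").strip()
        if not x_text or not y_text:
            problems.append((row_number, "missing geometry value"))
            continue
        try:
            x, y = float(x_text), float(y_text)
        except ValueError:
            problems.append((row_number, "non-numeric coordinate"))
            continue
        # Assumption: x is longitude and y is latitude, both in degrees.
        if not (-180.0 <= x <= 180.0 and -90.0 <= y <= 90.0):
            problems.append((row_number, "outside longitude/latitude range"))
    return problems
```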