ES|QL columns with additional meta-data for type, units, meaning #108819

Open
craigtaverner opened this issue May 20, 2024 · 1 comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@craigtaverner
Contributor

Description

ES|QL has many functions that extract parts of known data types into primitive types. In these cases the original type is still meaningful, and that information should be preserved.

For example, calculating a date_diff in some specified units should retain the knowledge that the resulting integer is actually a temporal value with known units:

FROM index | EVAL m = date_diff("minutes", event.start, now()) | KEEP m

Currently this query returns a column called m with type integer. Clients cannot tell that the integer represents minutes, so they cannot, for example, render 120699 as "83 days, 19 hours and 39 minutes", or in whatever form the client or app prefers.

To solve this, the column header in the returned JSON should be updated with an additional meta field, for example:

"columns": [
  {
    "name": "m",
    "type": "integer",
    "meta": {
      "type": "duration",
      "units": "minutes"
    }
  }
]
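
To illustrate how a client might use such a header, here is a minimal Python sketch that renders a value from the column above. It assumes the proposed meta field exists exactly as shown; the helper name render_duration, the MINUTES_PER_UNIT table, and the set of supported units are hypothetical, and only non-negative values are handled.

```python
# Hypothetical client-side helper. The "meta" field is the proposal from this
# issue, not part of any existing ES|QL response.
MINUTES_PER_UNIT = {"minutes": 1, "hours": 60, "days": 1440}


def render_duration(column, value):
    """Render a non-negative duration value using the column's proposed meta field."""
    meta = column.get("meta", {})
    # Without usable meta-data the client can only show the raw number.
    if meta.get("type") != "duration" or meta.get("units") not in MINUTES_PER_UNIT:
        return str(value)
    total_minutes = value * MINUTES_PER_UNIT[meta["units"]]
    days, rest = divmod(total_minutes, 1440)
    hours, minutes = divmod(rest, 60)
    return f"{days} days, {hours} hours and {minutes} minutes"


column = {
    "name": "m",
    "type": "integer",
    "meta": {"type": "duration", "units": "minutes"},
}
print(render_duration(column, 120699))  # 83 days, 19 hours and 39 minutes
```

A column without the meta field falls back to the plain integer, which matches what clients can do today.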

Further uses of meta-data in columns

This concept can be expanded to many related use-cases:

  • Geo/Spatial units: for example, an H3 geohex grid ID can be a long or a keyword, but clients would like to know that the values are H3 cell IDs.
  • Geo/Spatial field extraction: ST_X and ST_Y extract coordinates as double values, but the fact that they are x and y, or longitude and latitude, is still useful.
  • Source of generated columns: for a number produced by STATS, for example, it might be useful to know its source (the stats function, grouping key, etc.).
  • When types themselves have meta-data: for geo/spatial types we anticipate adding SRID (spatial reference system ID) to the types, so a geometry column of one projection is seen as different from a geometry column of another projection. This could become important if we start supporting multiple projections in future.

An example showing some of this:

FROM airports
| STATS centroid = ST_CENTROID_AGG(location), airports = COUNT(*) BY country
| EVAL longitude = ST_X(centroid)
| EVAL latitude = ST_Y(centroid)
| EVAL h3 = H3_lat_lng_to_cell(centroid, 5)
| EVAL h3_cell = TO_STRING(h3)

"columns": [
  {
    "name": "country",
    "type": "keyword"
  },
  {
    "name": "centroid",
    "type": "geo_point",
    "meta": {
      "stats": "ST_CENTROID_AGG(location) BY country",
      "srid": 4326
    }
  },
  {
    "name": "airports",
    "type": "long",
    "meta": {
      "stats": "COUNT(*) BY country"
    }
  },
  {
    "name": "longitude",
    "type": "double",
    "meta": {
      "eval": "ST_X(centroid)",
      "units": "degrees",
      "srid": 4326
    }
  },
  {
    "name": "latitude",
    "type": "double",
    "meta": {
      "eval": "ST_Y(centroid)",
      "units": "degrees",
      "srid": 4326
    }
  },
  {
    "name": "h3",
    "type": "long",
    "meta": {
      "grid": "h3",
      "precision": 5
    }
  },
  {
    "name": "h3_cell",
    "type": "keyword",
    "meta": {
      "grid": "h3",
      "precision": 5
    }
  }
]
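
As a rough sketch of how a client could act on headers like these, the Python below picks a description for each column based on which meta keys are present. The key names (grid, precision, units, srid, stats) come from the example above; the function describe_column, the order of the checks, and the output wording are assumptions made for illustration only.

```python
# Hypothetical dispatch on the proposed "meta" field; none of this is an
# existing ES|QL client API.
def describe_column(column):
    """Return a human-readable description of a column header."""
    meta = column.get("meta", {})
    if "grid" in meta:
        return f"{meta['grid']} cell id (precision {meta['precision']})"
    if "units" in meta and "srid" in meta:
        return f"coordinate in {meta['units']} (SRID {meta['srid']})"
    if "stats" in meta:
        return f"aggregate: {meta['stats']}"
    # No meta-data: fall back to the primitive type, as clients do today.
    return column["type"]


columns = [
    {"name": "country", "type": "keyword"},
    {"name": "airports", "type": "long", "meta": {"stats": "COUNT(*) BY country"}},
    {"name": "longitude", "type": "double",
     "meta": {"eval": "ST_X(centroid)", "units": "degrees", "srid": 4326}},
    {"name": "h3", "type": "long", "meta": {"grid": "h3", "precision": 5}},
]
for col in columns:
    print(col["name"], "->", describe_column(col))
```

Both h3 (long) and h3_cell (keyword) would get the same description here, which is the point of the proposal: the primitive type differs, but the meaning is shared.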
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)
