Query Components¶

Datasources¶

data source는 table, union, query 세종류로 사용할 수 있음

Table Data Source¶

{
    "type": "table",
    "name": "<datasource_name>"
}

Union Data Source¶

Union의 같은 경우 사용하려면 Broker 나 Router에게 요청해야함

데이터 소스들은 동일한 스키마를 가져야함

{
    "type": "union",
    "dataSources": ["<datasource_name1>", "<datasource_name2>"]
}

Query Data Source¶

groupBy Query만 사용가능

nested groupBy 를 위한

{
    "type": "query",
    "query": {
        "type": "groupBy",
        ...
    }
}

DimensionSpec¶

Query에서 dimension을 바꿀 수 있음

Default DimensionSpec¶

{
  "type" : "default",
  "dimension" : <dimension>,
  "outputName": <output_name>,
  "outputType": <"STRING"|"LONG"|"FLOAT"> // STRING default
}

Extraction DimensionSpec¶

extraction_function을 사용해서 Dimension을 바꿀 수 있음

{
  "type" : "extraction",
  "dimension" : <dimension>,
  "outputName" :  <output_name>,
  "outputType": <"STRING"|"LONG"|"FLOAT">,
  "extractionFn" : <extraction_function>
}

Filtered DimensionSpecs¶

Lookup DimensionSpecs¶

Output Types¶

Regular Expression¶

Regular Expression
Partial
Search Query
Substring
Strlen
Time Format
Time Parsing
Javascript
Lookup
Registered Lookup
Cascade
String Format
Upper and Lower
Bucket

Filtered DimensionSpec¶

multi-value dimensions에 대해서 filter를 걸 수 있음

직접 value를 넣는 listFiltered와 정규표현식을 사용하는 regexFiltered 가 있음

{
    "type" : "listFiltered",
    "delegate" : <dimensionSpec>,
    "values": <array of strings>,
    "isWhitelist": <optional attribute for true/false, default is true> }

{
    "type" : "regexFiltered",
    "delegate" : <dimensionSpec>,
    "pattern": <java regex pattern>
}

Lookup DimensionSpec¶

replaceMissingValueWith는 ""로 설정하면 null로 설정하는 것과 같음

default는 replaceMissingValueWith는 null, retainMissingValue는 false

{
  "type":"lookup",
  "dimension":"dimensionName",
  "outputName":"dimensionOutputName",
  "replaceMissingValueWith":"missing_value",
  "retainMissingValue":false,
  "lookup":{"type": "map", "map":{"key":"value"}, "isOneToOne":false}
}

Limit Spec¶

Default LimitSpec¶

{
    "type"    : "default",
    "limit"   : <integer_value>,
    "columns" : [list of OrderByColumnSpec],
}

OrderByColumnSpec¶

{
    "dimension" : "<Any dimension or metric name>",
    "direction" : <"ascending"|"descending">,
    "dimensionOrder" : <"lexicographic"(default)|"alphanumeric"|"strlen"|"numeric">
}

Having¶

Query의 결과에 대해서 Filtering

{
    "type" : "filter",
    "filter" : <any Druid query filter>
}

Query Filter¶

Numeric Filters

Equal To

{
    "type": "equalTo",
    "aggregation": "<aggregate_metric>",
    "value": <numeric_value>
}

Greater Than

{
    "type": "greaterThan",
    "aggregation": "<aggregate_metric>",
    "value": <numeric_value>
}

Less Than

{
    "type": "lessThan",
    "aggregation": "<aggregate_metric>",
    "value": <numeric_value>
}

Dimension Selector Filter

dimSelector

{
    "type": "dimSelector",
    "dimension": "<dimension>",
    "value": <dimension_value>
}

Logic expression Filters

AND

{
    "type": "and",
    "havingSpecs": [<having clause>, <having clause>, ...]
}

OR

{
    "type": "or",
    "havingSpecs": [<having clause>, <having clause>, ...]
}

NOT

{
    "type": "not",
    "havingSpec": <having clause>
}

Aggregation Granularity¶

Simple Granularities¶

all, none, second, minute, fifteen_minute, thirty_minute, hour, day, week, month, quarter, year 의 string 값을 지원함

all은 모든 bucket의 데이터를 하나의 bucket으로 담음.
none은 bucket에 담지 않고 index의 granularity를 사용함. TimeSeries Query에서는 none은 비추!

Query Granularity의 값이 Ingestion Granularity의 값보다 작으면 의미가 없음.(인덱싱이 되지 않아서)

Duration Granularities¶

milliseconds 단위로 기간을 지정해서 사용할 수 있음

{
    "type": "duration",
    "duration": 3600000,
    "origin": "2018-01-01T00:30:00Z"
}

origin의 기본 값은 1970-01-01T00:00:00Z

Period Granularities¶

years, months, weeks, day, hours, minutes, seconds 단위로 기간을 지정

ISO 8601 포멧을 따른다함(P1Y, P2M, P3W, P4H, PT1H30M 등)

{
    "type": "period",
    "period": "P3M",
    "timeZone": "America/Los_Angeles",
    "origin": "2012-02-01T00:00:00-08:00"
}

timeZone은 Joda Time Library에서 지원하는 것을 사용할 수 있음

Query Filter¶

Selector Filter¶

"filter": {
    "type": "selector",
    "dimension": <dimension_string>,
    "value": <dimension_value_string>
}

Column Comparison Filter¶

"filter": { 
    "type": "columnComparison",
    "dimensions": [<dimension_a>, <dimension_b>] 
}

Regular Expression Filter¶

"filter": { 
    "type": "regex",
    "dimension": <dimension_string>,
    "pattern": <pattern_string>
}

Logical Expression Filter¶

AND

"filter": { "type": "and", "fields": [<filter>, <filter>, ...] }

OR

"filter": { "type": "or", "fields": [<filter>, <filter>, ...] }

NOT

"filter": { "type": "not", "field": <filter> }

Javascript Filter¶

return value가 true인 것들만 남김

{
    "type" : "javascript",
    "dimension" : <dimension_string>,
    "function" : "function(value) { <...> }"
}

Extraction Filter¶

extraction function을 사용해서 필터링

Search Filter¶

{
    "filter": {
        "type": "search",
        "dimension": <dimension_string>,
        "query": {
          "type": <search_query_type>,
          "value": "value_1" 
        }        
    }
}

Contains

property	description	required?
type	This String should always be "contains".	yes
value	A String value to run the search over.	yes
caseSensitive	Whether two string should be compared as case sensitive or not	no (default == false)

Insensitive Contains

property	description	required?
type	This String should always be "insensitive_contains".	yes
value	A String value to run the search over.	yes

Fragment

property	description	required?
type	This String should always be "fragment".	yes
values	A JSON array of String values to run the search over.	yes
caseSensitive	Whether strings should be compared as case sensitive or not. Default: false(insensitive)	no

In Filter¶

{
    "type": "in",
    "dimension": <dimension_string>,
    "values": ["value_1", "value_2", "value_3"]
}

Like Filter¶

{
    "type": "like",
    "dimension": "last_name",
    "pattern": "D%"
}

property	type	description	required?
type	String	This should always be "like".	yes
dimension	String	The dimension to filter on	yes
pattern	String	LIKE pattern, such as "foo%" or "___bar".	yes
escape	String	An escape character that can be used to escape special characters.	no
extractionFn	Extraction function	Extraction function to apply to the dimension	no

Bound Filter¶

Interval Filter¶

Aggregations¶

Count Aggregator¶

row count를 기반으로 counting함

{ "type" : "count", "name" : <output_name> }

Sum Aggregator¶

longSum, doubleSum, floatSum
doubleMin, doubleMax
floatMin, floatMax

First/Last Aggregator¶

doubleFirst, floaFirst, longFirst
doubleLast, floatLast, longLast

Javascript Aggregator¶

{ "type": "javascript",
  "name": "<output_name>",
  "fieldNames"  : [ <column1>, <column2>, ... ],
  "fnAggregate" : "function(current, column1, column2, ...) {
                     <updates partial aggregate (current) based on the current row values>
                     return <updated partial aggregate>
                   }",
  "fnCombine"   : "function(partialA, partialB) { return <combined partial results>; }",
  "fnReset"     : "function()                   { return <initial value>; }"
}

Approximate Aggregations¶

Cardinality Aggregator¶

dimension의 cardinality를 HyperLogLog Algorithm을 사용하여 계산함.

{
  "type": "cardinality",
  "name": "<output_name>",
  "fields": [ <dimension1>, <dimension2>, ... ],
  "byRow": <false | true> # (optional, defaults to false),
  "round": <false | true> # (optional, defaults to false)
}

Cardinality by value¶

byRow를 false를 해놓으면(default false) 값에 대해서 cadinality를 계산함

SELECT COUNT(DISTINCT(dimension)) FROM <datasource>;

SELECT COUNT(DISTINCT(value)) FROM (
  SELECT dim_1 as value FROM <datasource>
  UNION
  SELECT dim_2 as value FROM <datasource>
  UNION
  SELECT dim_3 as value FROM <datasource>
);

Cardinality by row¶

byRow를 true 이면 dimension의 조합 cardinality를 계산함

SELECT COUNT(*) FROM ( SELECT DIM1, DIM2, DIM3 FROM <datasource> GROUP BY DIM1, DIM2, DIM3 )

HyperUnique Aggregator¶

{ 
  "type" : "hyperUnique",
  "name" : <output_name>,
  "fieldName" : <metric_name>,
  "isInputHyperUnique" : false,
  "round" : false
}

Miscellaneous Aggregations¶

Filtered Aggregator¶

필터링 된 값만 집계함

필터 + 집계를 한번에 할 수 있음

{
  "type" : "filtered",
  "filter" : {
    "type" : "selector",
    "dimension" : <dimension>,
    "value" : <dimension value>
  }
  "aggregator" : <aggregation>
}

Post Aggregations¶

aggregation된 결과에 다른 처리를 해야하는 경우 사용

Arithmetic post-aggregator¶

postAggregation : {
  "type"  : "arithmetic",
  "name"  : <output_name>,
  "fn"    : <arithmetic_function>,
  "fields": [<post_aggregator>, <post_aggregator>, ...],
  "ordering" : <null (default), or "numericFirst">
}

arithmetic_function = [+, -, *, /, quotient]

Field accessor post-aggregators¶

Constant post-aggregator¶

Greatest / Least post-aggregators¶

JavaScript post-aggregator¶

HyperUnique Cardinality post-aggregator¶

union dataSource를 사용하려면 쿼리를 Broker에게 보내야함