This Week in Databend #94
May 21, 2023 · 4 min read
PsiACE
Stay up to date with the latest weekly developments on Databend!
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
What's On In Databend
Stay connected with the latest news about Databend.
Computed Columns
Computed columns are generated from other columns by a scalar expression. There are two types of computed columns: stored and virtual.
A stored computed column computes and stores the result value when a row is inserted. Use this SQL syntax to create one:
column_name <type> AS (<expr>) STORED
A virtual computed column, by contrast, is calculated at query time, and its result value is not stored. To create one, use this SQL syntax:
column_name <type> AS (<expr>) VIRTUAL
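The difference between the two kinds can be sketched with a toy in-memory model. This is plain Python, not Databend code: `ToyTable`, its `evaluations` counter, and the `price`/`quantity` columns are hypothetical names made up for illustration. The sketch only shows when the expression runs: once per insert for a stored column, once per row per query for a virtual one.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Expr = Callable[[Row], float]


class ToyTable:
    """Toy in-memory table with one computed column (illustrative only)."""

    def __init__(self, name: str, expr: Expr, stored: bool) -> None:
        self.name = name          # name of the computed column
        self.expr = expr          # scalar expression over the other columns
        self.stored = stored      # True: STORED, False: VIRTUAL
        self.rows: List[Row] = []
        self.evaluations = 0      # how many times the expression has run

    def _eval(self, row: Row) -> float:
        self.evaluations += 1
        return self.expr(row)

    def insert(self, row: Row) -> None:
        row = dict(row)
        if self.stored:
            # Stored: compute at insert time and keep the result with the row.
            row[self.name] = self._eval(row)
        self.rows.append(row)

    def select(self) -> List[Row]:
        if self.stored:
            # Stored: the value is read back without re-evaluating anything.
            return [dict(r) for r in self.rows]
        # Virtual: nothing was stored, so compute on every read.
        return [{**r, self.name: self._eval(r)} for r in self.rows]


def total(row: Row) -> float:
    return row["price"] * row["quantity"]


stored = ToyTable("total", total, stored=True)
virtual = ToyTable("total", total, stored=False)
for table in (stored, virtual):
    table.insert({"price": 2.0, "quantity": 3.0})
    table.select()
    table.select()

print(stored.evaluations, virtual.evaluations)  # 1 2
```

Both tables return the same rows; the trade-off is extra storage for the stored column versus extra computation at query time for the virtual one.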
VACUUM TABLE
The VACUUM TABLE command helps to optimize the system performance by freeing up storage space through the permanent removal of historical data files from a table. This includes:
- Snapshots associated with the table, as well as their relevant segments and blocks.
- Orphan files. Orphan files in Databend refer to snapshots, segments, and blocks that are no longer associated with the table. Orphan files might be generated from various operations and errors, such as during data backups and restores, and can take up valuable disk space and degrade the system performance over time.
VACUUM TABLE requires Enterprise Edition. To inquire about upgrading, please contact Databend Support.
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Enable Cache in Python Binding
Databend supports data caching and query result caching, which can effectively accelerate queries. The Python bindings of Databend also support these features, albeit with slight differences.
Query result caching can be enabled conveniently with a SQL statement:
>>> from databend import SessionContext
>>> ctx = SessionContext()
>>> ctx.sql("set enable_query_result_cache = 1")
Data caching can be enabled through environment variables:
>>> import os
>>> os.environ["CACHE_DATA_CACHE_STORAGE"] = "disk"
>>> from databend import SessionContext
>>> ctx = SessionContext()
>>> ctx.sql("select * from system.configs where name like '%data_cache%'")
┌────────────────────────────────────────────────────────────────────────────┐
│ group │ name │ value │ description │
│ String │ String │ String │ String │
├─────────┼──────────────────────────────────────────┼─────────┼─────────────┤
│ 'cache' │ 'data_cache_storage' │ 'disk' │ '' │
│ 'cache' │ 'table_data_cache_population_queue_size' │ '65536' │ '' │
└────────────────────────────────────────────────────────────────────────────┘
Feel free to use it in your data science workflow.
Highlights
Here are some noteworthy items from this week. Perhaps you can find something that interests you.
- Read Docs | Date & Time - Formatting Date and Time to learn how to precisely control the format of time and date.
- Added support for transforming data when loading it from a URI.
- Added support for replacing with stage attachment.
- Added bitmap-related functions: bitmap_contains, bitmap_has_all, bitmap_has_any, bitmap_or, bitmap_and, bitmap_xor, etc.
- Supported intdiv operator //.
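As a rough mental model for the bitmap functions listed above, their behavior can be mimicked with Python sets of integers. This is a sketch under an assumption: the semantics below are inferred from the function names (membership, superset, overlap, union, intersection, symmetric difference) and are not taken from Databend's implementation, so check the function docs before relying on them.

```python
from typing import FrozenSet

Bitmap = FrozenSet[int]  # stand-in for a bitmap: a set of non-negative integers


def bitmap_contains(bitmap: Bitmap, value: int) -> bool:
    # Assumed meaning: is `value` set in the bitmap?
    return value in bitmap


def bitmap_has_all(a: Bitmap, b: Bitmap) -> bool:
    # Assumed meaning: does `a` contain every element of `b`?
    return b <= a


def bitmap_has_any(a: Bitmap, b: Bitmap) -> bool:
    # Assumed meaning: do `a` and `b` share at least one element?
    return not a.isdisjoint(b)


def bitmap_or(a: Bitmap, b: Bitmap) -> Bitmap:
    return a | b  # union


def bitmap_and(a: Bitmap, b: Bitmap) -> Bitmap:
    return a & b  # intersection


def bitmap_xor(a: Bitmap, b: Bitmap) -> Bitmap:
    return a ^ b  # elements present in exactly one of the two


left = frozenset({1, 2, 3})
right = frozenset({3, 4})
print(sorted(bitmap_and(left, right)))  # [3]
print(sorted(bitmap_xor(left, right)))  # [1, 2, 4]
```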
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Remove if_not_exists from the Meta Request
In CreateIndexReq/CreateTableReq, we use if_not_exists to indicate whether creating an index/table that already exists should be reported as an error.
pub struct CreateIndexReq {
    pub if_not_exists: bool,
    pub name_ident: IndexNameIdent,
    pub meta: IndexMeta,
}
The if_not_exists clause only affects the outcome that is presented to the user, and does not alter the behavior of the meta-service operation.
Therefore, it will be more effective for SchemaApi to provide either a Created or an Exist status code, allowing the caller to determine whether to generate an error message.
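The proposed split of responsibilities can be sketched in a few lines. This is a Python model for illustration only, not Databend's Rust code: `CreateStatus`, `ToyMeta`, and `handle_create_index` are hypothetical names, and the real SchemaApi signature may look quite different. The point is that the meta layer reports what happened, while the caller alone interprets if_not_exists.

```python
from enum import Enum
from typing import Dict


class CreateStatus(Enum):
    CREATED = "created"
    EXIST = "exist"


class ToyMeta:
    """Stand-in for the meta-service: it knows nothing about if_not_exists."""

    def __init__(self) -> None:
        self.indexes: Dict[str, dict] = {}

    def create_index(self, name: str, meta: dict) -> CreateStatus:
        if name in self.indexes:
            # Same operation either way: report the fact, do not raise.
            return CreateStatus.EXIST
        self.indexes[name] = meta
        return CreateStatus.CREATED


def handle_create_index(
    meta_service: ToyMeta, name: str, meta: dict, if_not_exists: bool
) -> CreateStatus:
    """Caller side: decides whether an existing index is an error."""
    status = meta_service.create_index(name, meta)
    if status is CreateStatus.EXIST and not if_not_exists:
        raise ValueError(f"index '{name}' already exists")
    return status


service = ToyMeta()
first = handle_create_index(service, "idx", {"columns": ["a"]}, if_not_exists=False)
second = handle_create_index(service, "idx", {"columns": ["a"]}, if_not_exists=True)
print(first.name, second.name)  # CREATED EXIST
```

With this shape, the request body no longer needs to carry the flag, which is the change the issue below proposes.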
Issue #11456 | Moving if_not_exists out of meta request body
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.
New Contributors
We always welcome everyone with open arms and can't wait to see how you'll help our community grow and thrive.
- @silver-ymz made their first contribution in #11487. Added five bitmap-related functions.
- @Jake-00 made their first contribution in #11503. Modified duplicate test case for SOUNDS LIKE syntax.
- @gitccl made their first contribution in #11507. Added five bitmap-related functions and fixed panic when calling with empty bitmap.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.1.38-nightly...v1.1.43-nightly
🎉 Contributors 24 contributors
Thanks a lot to the contributors for their excellent work.
🎈Connect With Us
Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.
Join the Databend Community to try, get help, and contribute!