![]() |
Home | Libraries | People | FAQ | More |
Client-side SQL formatting allows running SQL queries with user-supplied parameters securely. It can be used as a simpler and more flexible alternative to prepared statements. While prepared statements expand queries server-side, SQL formatting does it client-side. Please read the comparison with prepared statements and the security considerations sections for more info.
with_params
takes a SQL query string with placeholders and a set of parameters. When passed
to execute
or async_execute
,
the query is expanded in the client with the supplied parameters and sent to
the server for execution:
std::string employee_name = get_name(); // employee_name is an untrusted string results result; // Expand the query and execute it. The expansion happens client-side. // If employee_name is "John", the executed query would be: // "SELECT id, salary FROM employee WHERE last_name = 'John'" conn.execute( with_params("SELECT id, salary FROM employee WHERE last_name = {}", employee_name), result );
Curly braces ({}
) represent placeholders
(technically called replacement fields). The notation
and semantics are similar to std::format
.
All fundamental types can be used as query parameters. This includes integers, floating point types, strings, blobs, dates and times:
// Will execute "SELECT id FROM employee WHERE salary > 42000" results result; conn.execute(with_params("SELECT id FROM employee WHERE salary > {}", 42000), result);
std::optional<T>
and
boost::optional<T>
can
also be used:
std::optional<std::int64_t> salary; // get salary from a possibly untrusted source results result; // Depending on whether salary has a value or not, executes: // "UPDATE employee SET salary = 42000 WHERE id = 1" // "UPDATE employee SET salary = NULL WHERE id = 1" conn.execute(with_params("UPDATE employee SET salary = {} WHERE id = {}", salary, 1), result);
Collections and ranges are supported, as long as its elements can be formatted:
results result; std::vector<long> ids{1, 5, 20}; // Executes "SELECT * FROM employee WHERE id IN (1, 5, 20)" conn.execute(with_params("SELECT * FROM employee WHERE id IN ({})", ids), result);
See this section for more on formatting ranges, and this table for a reference of types that have built-in support for SQL formatting.
![]() |
Note |
---|---|
Like with |
Like std::format
, you can use arguments with explicit
indices:
// Recall that you need to set connect_params::multi_queries to true when connecting // before running semicolon-separated queries. Executes: // "UPDATE employee SET first_name = 'John' WHERE id = 42; SELECT * FROM employee WHERE id = 42" results result; conn.execute( with_params( "UPDATE employee SET first_name = {1} WHERE id = {0}; SELECT * FROM employee WHERE id = {0}", 42, "John" ), result );
See this section for a reference on the format string syntax.
Not all values can be formatted. If the library finds that formatting a certain value can cause an ambiguity that could lead to a security problem, an error will be issued and the query won't be sent to the server. Here are the most common errors:
client_errc::invalid_encoding
blob
and blob_view
types for values that don't represent character strings, but arbitrary
binary values.
client_errc::unformattable_value
double
contains a
NaN
or an Inf
, unsupported by MySQL.
NULL
before passing them to client-side SQL formatting.
client_errc::unknown_character_set
reset_connection
or if you used a custom connect_params::connection_collation
when connecting.
connection_pool
instead of manually resetting connections. If you can't, use the
default connection_collation
when connecting, and use set_character_set
or async_set_character_set
after resetting connections.
For example:
try { // If the connection is using UTF-8 (the default), this will throw an error, // because the string to be formatted is not valid UTF-8. // The query never reaches the server. results result; conn.execute(with_params("SELECT {}", "bad\xff UTF-8"), result); } catch (const boost::system::system_error& err) { BOOST_TEST(err.code() == boost::mysql::client_errc::invalid_encoding); }
with_params
is handy, but may
fall short in some cases involving queries with complex logic. For these cases,
you can use format_sql
and format_sql_to
to expand a query without executing it. These APIs don't involve communication
with the server.
format_sql
is the simplest, and is akin to std::format
:
// Compose the SQL query without executing it. // format_opts returns a system::result<format_options>, // contains settings like the current character set. // If the connection is using an unknown character set, this will throw an error. std::string query = boost::mysql::format_sql( conn.format_opts().value(), "SELECT id, salary FROM employee WHERE last_name = {}", "Doe" ); BOOST_TEST(query == "SELECT id, salary FROM employee WHERE last_name = 'Doe'");
format_sql
requires a format_options
instance describing connection configuration, like the character set currently
in use. any_connection::format_opts
provides an easy way to retrieve these. This
section contains more info about format_opts
.
Some use cases, usually involving conditionals, may not be expressible in terms
of a single format string. In such cases, you can use format_context
and format_sql_to
to build query strings incrementally:
// Compose a query that retrieves all employees in a company, // with an optional limit std::string compose_select_query( boost::mysql::format_options opts, string_view company_id, std::optional<long> limit ) { // format_context will accumulate the query as we compose it boost::mysql::format_context ctx(opts); // format_sql_to expands a format string and appends the result // to a format context. This way, we can build our query in smaller pieces // Add all the query except for the LIMIT clause boost::mysql::format_sql_to(ctx, "SELECT * FROM employee WHERE company_id = {}", company_id); if (limit) { // Add the LIMIT clause boost::mysql::format_sql_to(ctx, " LIMIT {}", *limit); } // Retrieve the generated query string. // get() returns a boost::system::result<std::string> that // contains an error if any of the format operations failed. // Calling value() will throw on error, like format_sql does return std::move(ctx).get().value(); }
std::string query = compose_select_query(conn.format_opts().value(), "HGS", {}); BOOST_TEST(query == "SELECT * FROM employee WHERE company_id = 'HGS'"); query = compose_select_query(conn.format_opts().value(), "HGS", 50); BOOST_TEST(query == "SELECT * FROM employee WHERE company_id = 'HGS' LIMIT 50");
sequence
uses this feature to make formatting ranges easier.
Any type that works with with_params
also does with format_sql
and
format_sql_to
. These types
are said to satisfy the Formattable
concept. This table summarizes
such types.
The sequence
function can be used when the default range formatting isn't sufficient. If
the elements in your range are not formattable, you can pass a user-defined
function to sequence
describing
how to format each element:
// Employee is a plain struct, not formattable by default std::vector<employee> employees{ {"John", "Doe", "HGS"}, {"Kate", "Smith", "AWC"}, }; std::string query = format_sql( conn.format_opts().value(), "INSERT INTO employee (first_name, last_name, company_id) VALUES {}", sequence( employees, [](const employee& e, format_context_base& ctx) { // This function will be called for each element in employees, // and should format a single element into the passed ctx. // Commas will be inserted separating elements. format_sql_to(ctx, "({}, {}, {})", e.first_name, e.last_name, e.company_id); } ) ); BOOST_TEST( query == "INSERT INTO employee (first_name, last_name, company_id) VALUES " "('John', 'Doe', 'HGS'), ('Kate', 'Smith', 'AWC')" );
By default, elements are separated by commas, but this is configurable:
// A collection of filters to apply to a query std::vector<std::pair<string_view, string_view>> filters{ {"company_id", "HGS" }, {"first_name", "John"}, }; std::string query = format_sql( conn.format_opts().value(), "SELECT * FROM employee WHERE {}", sequence( filters, [](const std::pair<string_view, string_view>& f, format_context_base& ctx) { // Compose a single filter format_sql_to(ctx, "{:i} = {}", f.first, f.second); }, " AND " // glue string: separate each element with AND clauses ) ); BOOST_TEST(query == "SELECT * FROM employee WHERE `company_id` = 'HGS' AND `first_name` = 'John'");
You can use sequence
and with_params
together.
By default, sequence
copies
the range you pass as parameter, making it safer for async code. You can use
std::reference_wrapper
or std::span
to
avoid such copies.
Some types, like strings, can be formatted in multiple ways. As with std::format
,
you can select how to format them using format specifiers.
As we've seen, strings are formatted as single-quoted values by default. If
you use the {:i}
specifier, you can obtain dynamic SQL identifiers,
instead:
std::string query = boost::mysql::format_sql( conn.format_opts().value(), "SELECT id, last_name FROM employee ORDER BY {:i} DESC", "company_id" ); BOOST_TEST(query == "SELECT id, last_name FROM employee ORDER BY `company_id` DESC");
Specifiers are compatible with explicit indices and named arguments, too. This is equivalent to the previous snippet:
std::string query = boost::mysql::format_sql( conn.format_opts().value(), "SELECT id, last_name FROM employee ORDER BY {0:i} DESC", "company_id" );
Although both serve a similar purpose, they are fundamentally different. Prepared statements are parsed and expanded by the server. Client-side SQL expands the query in the client and sends it to the server as a string.
This means that client-side SQL does not understand your queries. It just knows about how to format MySQL types into a string without creating vulnerabilities, but otherwise treats your queries as opaque strings. Client-side SQL yields greater flexibility (you can dynamically compose any query), while statements have more limitations. This also means that you need to pay more attention to compose valid queries, specially when dealing with complex conditionals. Logic errors may lead to exploits. Please read the security considerations section for more info.
Client-side SQL entails less round-trips to the server than statements, and is usually more efficient for lightweight queries. However, it uses the less compact text protocol, which may be slower for queries retrieving a lot of data. See the efficiency considerations section for more info.
In general, use client-side SQL formatting for the following cases:
with_params
and only switch to statements
if your performance measurements says so.
INSERT
that inserts several rows
at once (see example
1 and example
2).
UPDATE
must be dynamic (example).
On the other hand, prefer prepared statements if:
Both client-side SQL formatting and prepared statements have pros and cons efficiency-wise:
connection_pool
with prepared statements, you can't use pooled_connection::return_without_reset
,
as this will leak the statement. With client-formatted queries, reset may
not be required if your SQL doesn't mutate session state.
Both client-side SQL formatting and prepared statements protect against SQL injection. Statements do so by parsing the query with placeholders server-side, before performing parameter substitution. Client-side SQL quotes and escapes your values to avoid injection, but does not understand your queries.
This means that you need to ensure that your queries always expand to valid SQL. This is trivial for simple queries, but may be an issue with more complex ones, involving ranges or dynamic identifiers. For instance, the following query may expand to invalid SQL if the provided range is empty:
// If ids.empty(), generates "SELECT * FROM employee WHERE id IN ()", which is a syntax error. // This is not a security issue for this query, but may be exploitable in more involved scenarios. // Queries involving only scalar values (as opposed to ranges) are not affected by this. // It is your responsibility to check for conditions like ids.empty(), as client-side SQL // formatting does not understand your queries. std::vector<int> ids; auto q = format_sql(conn.format_opts().value(), "SELECT * FROM employee WHERE id IN ({})", ids);
The risk is higher if you're building your query by pieces using format_sql_to
.
To sum up:
format_sql_to
,
unless you have no other option.
utf8mb4
).
SET NAMES
or SET
CHARACTER SET
statements directly - use any_connection::set_character_set
or async_set_character_set
,
instead.
format_sql
or format_sql_to
,
never craft format_options
values manually. Use any_connection::format_opts
,
instead.
C++ type |
Formatted as... |
Example |
---|---|---|
|
Integral literal |
format_sql(opts, "SELECT {}", 42) == "SELECT 42" format_sql(opts, "SELECT {}", -1) == "SELECT -1" |
|
Integral literal |
format_sql(opts, "SELECT {}", 42u) == "SELECT 42" |
|
Integral literal |
format_sql(opts, "SELECT {}", false) == "SELECT 0" format_sql(opts, "SELECT {}", true) == "SELECT 1" |
String types (convertible to |
Without format specifiers: single-quoted escaped string literal.
Note that |
// Without format specifier: escaped, quoted string value format_sql(opts, "SELECT {}", "Hello world") == "SELECT 'Hello world'" format_sql(opts, "SELECT {}", "Hello 'world'") == R"(SELECT 'Hello \'world\'')" // {:i}: escaped, quoted dynamic identifier format_sql(opts, "SELECT {:i} FROM t", "salary") == "SELECT `salary` FROM t" format_sql(opts, "SELECT {:i} FROM t", "sal`ary") == "SELECT `sal``ary` FROM t" // {:r}: raw, unescaped SQL. WARNING: incorrect use can cause vulnerabilities format_sql(opts, "SELECT * FROM t WHERE id = 42 {:r} salary > 20000", "OR") == "SELECT * FROM t WHERE id = 42 OR salary > 20000" |
Blob types (convertible to |
Hex string literal |
format_sql(opts, "SELECT {}", blob{0x00, 0x48, 0xff}) == R"(SELECT x'0048ff')" |
|
Floating-point literal, after casting to |
// Equivalent to format_sql(opts, "SELECT {}", static_cast<double>(4.2f)) // Note that MySQL uses doubles for all floating point literals format_sql(opts, "SELECT {}", 4.2f) == "SELECT 4.199999809265137e+00" |
|
Floating-point literal. |
format_sql(opts, "SELECT {}", 4.2) == "SELECT 4.2e+00" |
Single quoted, |
format_sql(opts, "SELECT {}", date(2021, 1, 2)) == "SELECT '2021-01-02'" |
|
Single quoted |
format_sql(opts, "SELECT {}", datetime(2021, 1, 2, 23, 51, 14)) == "SELECT '2021-01-02 23:51:14.000000'" |
|
Single quoted |
format_sql(opts, "SELECT {}", std::chrono::seconds(121)) == "SELECT '00:02:01.000000'" |
|
|
|
format_sql(opts, "SELECT {}", nullptr) == "SELECT NULL" |
|
Formats the underlying value if there is any. |
format_sql(opts, "SELECT {}", optional<int>(42)) == "SELECT 42" format_sql(opts, "SELECT {}", optional<int>()) == "SELECT NULL" |
|
Formats the underlying value. |
format_sql(opts, "SELECT {}", field(42)) == "SELECT 42" format_sql(opts, "SELECT {}", field("abc")) == "SELECT 'abc'" format_sql(opts, "SELECT {}", field()) == "SELECT NULL" |
Range of formattable elements. Informally, such ranges support
Ranges of ranges are not supported. Note that
See |
Formats each element in the range, separating elements with commas. |
// long is a WritableField format_sql(opts, "SELECT {}", std::vector<long>{1, 5, 20}) == "SELECT 1, 5, 20" // C++20 ranges and other custom ranges accepted format_sql(opts, "SELECT {}", std::vector<long>{1, 5, 20} | std::ranges::views::take(2)) == "SELECT 1, 5" // Apply the 'i' specifier to each element in the sequence format_sql( opts, "SELECT {::i} FROM employee", std::vector<string_view>{"first_name", "last_name"} ) == "SELECT `first_name`, `last_name` FROM employee" |
|
Formats each element in a range by calling a user-supplied function,
separating elements by a glue string (a comma by default). |
format_sql( opts, "SELECT {}", sequence( std::vector<int>{1, 5, 20}, [](int val, format_context_base& ctx) { format_sql_to(ctx, "{}+1", val); } ) ) == "SELECT 1+1, 5+1, 20+1" |
Custom type that specializes |
Calls |
|
Formats the underlying value. Can represent any of the types above. |
format_sql(opts, "SELECT {}", formattable_ref(42)) == "SELECT 42" format_sql(opts, "SELECT {:i} FROM t", formattable_ref("salary")) == "SELECT `salary` FROM t" |