Verbiage: Changes i'd like to make to SQL

Journal Chacham's Journal: Verbiage: Changes i'd like to make to SQL 2

Journal by Chacham on Wednesday November 25, 2009 @01:02PM

Add * - col to get everything *but* a specific COLUMN
Standardize order for a TABLE's COLUMNs
Add pseudo-column to refer to a COLUMN by order
Add pseudo-column refer to the PK
Add pseudo-column refer to the FK in a join
Deprecate SELECT clause from EXISTS()
Make FROM keyword optional in an EXISTS()
Deprecate "ANSI-style outer joins"
Add an aggregate for string concatination
(Change UNION to mean UNION ALL, remove UNION ALL)
Add UNION DISTINCT
Add WITH KEY CHECKING clause INSERT and UPDATE

Add * - col to get everything *but* a specific COLUMN

* is used in ad-hoc queries. We want to see a few COLUMN but don;t care to specify which ones. * is just fine, except when one of those COLUMNs in a LOB or anything else that just goes too wide. For convenience, * - col1[, col2...] should work.

Standardize order for a TABLE's COLUMNs.

DBs store COLUMNs in order. A INSERT without a COLUMN specifier uses it. SELECT * uses it. It just isn't official. Good for ad-hoc queries.

Add pseudo-column to refer to a COLUMN by order

A UNION's ORDER BY can use a number, as a COLUMN name might not exist. Why not allow this for all ORDER BYs? The ORDER BY must happen after the SELECT because that's the data it works with, yet sometimes we have to repeat some complex syntax. Also, this would remove half the reason for dynamic SQL.

COL[UMN]() ought to do it, because COLUMN is already a keyword (in ALTER TABLE). Though, it might be good to start all keywords with *, so *(1) (as opposed to * 1 which means to multiple by 1) or *COL(1) might be good candidates. Or even a SQL-like SELECT COLUMN 1. But that just seems wrong because it's not refering to the COLUMN itself as in ALTER TABLE.

For example: SELECT COLUMN(1), COL(42) FROM ...

Add pseudo-column refer to the PK

How often during diagnostics do we want to grab the data associated with a PK? Often enough, i say.

For example: SELECT * FROM Moo WHERE *PK = 42;

Add pseudo-column refer to the FK in a join

Joins are usually on the FK. Usually the whole FK. This would be a convenience. It'd also make it easier to read the query and should an FK change, no reason to change the queries.

This is somewhat similar to the FK definition's REFERENCE clause, which refers to the target's PK by default.

For example: WHERE tab1.*FK = tab2.*FK

Deprecate SELECT clause from EXISTS()

SELECT in EXISTS is redundant. It also leads to SELECT 1 vs SELECT * debates, and confuses newcomers. Just get rid of it.

Make FROM keyword optional in an EXISTS()

An EXISTS clause must have a FROM (unless a pseudo-TABLE can be SELECTed which is pointless in an EXISTS) clause, and it is the first useful thing in the EXISTS, hence FROM is implied. Don't remove it, because it is implied, allow it to be explicit. It's also help readability to have it.

Deprecate "ANSI-style outer joins"

ANSI style joins are ugly, confusing, and wrong.

Ugly: clutters the FROM clause; adds a new ON keyword; forces all TABLE's in the ON clause to use the ANSI style as well

Confusing: nested joins are hard to follow; separates the join clause from other WHERE clauses

Wrong: FROM includes, WHERE exclude. a join exlcudes, therefore it belongs in the WHERE clause; standardizes RIGHT and LEFT as keywords which make little sense (a join should not refer to geography in the statement itself); standardize OUTER and INNER which are common terms but do not have any real relevance to the type of join.

When i first learnt SQL, i read Understanding SQL. He uses inner and outer as standard aliases for correlated sub-queries (that self-join). That make a lot of sense. Same TABLE, one is in the inner context, the other in the outer context. Now those are keywords and i have to stop using them, very annoying. And they are keywords for something that has nothing to do with being inner or outer.

Oracle's approach was good. Adding (+) to the end of a COLUMN reference made it a super-COLUMN, it that it had NULLs when no match was found. Pro: Attached to COLUMN itself which is what ultimately returns the NULL for a non-match. Con: It's the join that is special, not the COLUMN.

SQL Server wasn't bad with *= and =* making the equals be special, in that it didn't limit it to an exact match. Pro: It's the join clause that was marked as special, intuitive. Con it's the COLUMN that returns NULL, not the join.

Mixing the two: COL(+) *= COL is redundant.

Add an aggregate for string concatination

Following Oracle and SQL Server forums, this is a very common question. How do i aggregate?. There are tips, tricks, custom FUNCTIONs, and the like. The other aggregation FUNCTIONs aggregates and compute, and the compute obviates the aggregated data itself. MIN and MAX works like this, only keeping the current minimum or maximum scalar value. SUM, COUNT, AVG, and the like keep all the VALUEs (supporting DISTINCT as well) but return a new scalar. Concatenation would essentials return everything, just as one value. Perhaps this difference kept it away from being coded.

Delimiter would be any character(s). Have an option to allow keeping or removing the trailing delimiter, whether to delimit NULLs, and the like. Should support DISTINCT.

For example: CONCAT(val, delimiter, option)

(Change UNION to mean UNION ALL, remove UNION ALL)

If i was there when UNION was first implemented, i would have vehemently opposed UNION implying DISTINCT. I have to add a keyword to not do something automagically?? What were they thinking?

Also, many users use UNION when they should be using UNION ALL. Either wondering why DISTINCT was applied, or wasting cycles (DISTINCT is an expensive operation). It just isn't intuitive.

Unfortunately, it's been there for too long. I don't think it could be changed.

Add UNION DISTINCT

If UNION means what UNION ALL means, UNION DISTINCT would be required.

Currently, UNION generates the enitre list, then applies a DISTINCT. In a sense, UNION DISTINCT should remove from the second (the UNIONed) query that which already exists in the first query. It should actually apply DISTINCT to the first query too. If i wanted that, i'd add DISTINCT to the first query. If it doesn't appear in the first query, it should not apply a DISTINCT to the second query. If i wanted that, i'd add DISTINCT to the second query.

I guess if it were up to me, i'd abolish UNION and add a new keyword. APPEND perhaps?

Add WITH KEY CHECKING clause INSERT and UPDATE

When i write stored PROCEDUREs to do INSERTs, i have a "standard" to check the PK and all FKs so the INSERT will not fail. Similarly for UPDATEs where the key changes. That is, use an EXISTS for each KEY. I'd like to see that added.

One, it would (have an option to) allow the rest of the set to go through. Two, it would encourage more people to use it, resulting it more robust code. Three, it would remove alot of clutter. Four, (in the case of a PK or UNIQUE check) it would tell the optimizer that the subquery is not a new TABLE saving wasteful checking time. Five, it would remove the need for a DISTINCT in the SELECT subquery, in a case where the SELECT may pull up duplicates breaking te PK on the second one. DISTINCT is an expensive operation. The DB writer has to check the PK/UNIQUE KEY anyway, why use a wasteful DISTINCT?

This discussion has been archived. No new comments can be posted.