phpgt / database Goto Github PK

Database query organisation.

License: MIT License

PHP 100.00%

database-logic-organisation sql-query database-migrations database sql-database sql database-access database-api-organisation php-library php-pdo

database's Introduction

Database API organisation.

Encloses your application's database scripts within a simple and standardised interface, separating database access from your application logic.

The first parameter to any database functions is always the query name, which represents a query file on disk - either a raw SQL file or a PHP representation of a query using SqlBuilder.

Example usage

This library organises SQL access through a consistent API. To execute an example query located at src/query/user/getById.sql, the following pattern is used:

$userRow = $db->fetch("user/getById", 105);

Examples of CRUD operations:

// "fetchAll" method returns an iterable ResultSet of Row objects.
$bookResultSet = $db->fetchAll("shopitem/getItemsInCategory", "books");

foreach($bookResultSet as $bookRow) {
	echo "Book title: ", $bookRow->getString("title"), PHP_EOL;
	echo "Book price: £", ($bookRow->getFloat("price") + $bookRow->getFloat("vat")), PHP_EOL;
	
	if($bookRow->offerEnds) {
		echo "Item on offer until: ", $bookRow->getDateTime("offerEnds")->format("dS M Y");
	}
}

// "Create" method always returns the last inserted ID:
$newCustomerId = $db->create("customer/new", [
	"first_name" => "Marissa",
	"last_name" => "Mayer",
	"dob" => new DateTime("1975-05-30"),
]);

// "Update" or "delete" methods always return the number of affected rows:
$numberOfItemsAffected = $db->update("shop/item/increasePrice", [
	"percent" => 12.5,
	"max_increase" => 20.00,
]);

$numberOfDeletedReviews = $db->delete(
	"remove/deleteOlderThan",
	new DateTime("-6 months")
);

// Individual type-safe fields can be pulled from queries that return only one column:
$userName = $db->fetchString("user/getUsernameById", 105);

Features at a glance

database's People

Contributors

Stargazers

Watchers

Forkers

peter279k isabella232 zahidsqldba

database's Issues

Shorthand CRUD methods.

When calling a query shorthand, infer the CRUD method from the start of the query name:

getById = retrieve
createNewUser = create
etc.

How should empty() handle missing fields?

$row = $db->fetch("some/query");

if(empty($row->missing_field)) {
// ...
}

Currently throws NoSuchColumnException.

folder structure

How about the following folder structure for users to organise their sql files in their own project:

- src
   L ddl
       L mysql
           L <whatever>
       L sqlite
           L <whatever>
       L <etc>
   L dml
       L mysql
           L <whatever>
       L sqlite
           L <whatever>
       L <etc>

Should a Gt PreparedStatementException take-on the code from the exception it wraps?

Either that, or maybe implement some more specific exceptions (such as DuplicateKeyException). Currently, the PreparedStatementException has a code of 0 - you have to call getPrevious to access the underlying PDOException and access its description and error code. Anything useful actually.

If we don't either inherit the code or throw some more specific exceptions it would be better to simply drop the wrapper exception and let the PDOException bubble up.

Specify in README when QueryBuilder will be ready

The QueryBuilder is tracked in #26 , planned for release in v3, after all other general and advanced features are implemented.

README is currently misleading, not good for v1 release.

Property access for data, ArrayAccess for collections

This should be written into the styleguide.

When accessing a collection of things, such as obtaining reference to a QueryCollection, this should be done using array notation, as it is done currently.

However when accessing a data-like object, it should be done using property notation.

// Property access (new way):
$playerId = $db["Player"]->getByName("Binky")->id;

// Array access (old way):
$playerId = $db["Player"]->getByName("Binky")["id"];

The migration tool "swallows"/ ignores failed inserts

Not sure if it's because PDO doesn't throw an exception for it, but a migration that attempts to insert a null value into a non-nullable column fails to insert the row, but doesn't fail the migration.

Migrator has not been updated to use new Row property access

It is still attempting to use the old array-access notation so fails.

Auto-access to the first row in a ResultSet could be handled explicitly

Currently, accessing a column on the first row (via property-access on the ResultSet) will fail with an unhelpful notice if there are no Rows. A new exception should be added to make it explicit.

Readme example doesn't run

Gt\Database\DatabaseClient::__construct() must be an instance of Gt\Database\Query\QueryCollectionFactory, instance of Gt\Database\Connection\Settings given

The ability to specify a QueryCollectionFactory is great so I can use a different folder for queries (src/sql/query is my personal preference so I can have migrations in src/sql/migrations).

On which note, looking at the source, am I right in thinking that the default base directory for query files is actually getcwd() rather than src/query as specified in the readme example?

CRUD methods.

On the Query, need to be able to call these methods and their synonyms:

Create / Insert
Retrieve / Get
Update
Delete

Instantiate Driver in the QueryCollectionFactory.

Currently, the Settings object is passed around and only converted into a Driver in the Query class. This means there will not be a shared database driver for multiple queries.

Converting it in the QCF allows all QCs and Qs to share the same driver.

Database Migrations ignoring stored count

When running db-migrate, the function "getMigrationCount" does not retun a numeric value causing migrations to be run when they have already been applied.

Should baseDirectory for Migrator actually be the migration directory?

The first parameter for Settings is the base directory - which in the normal QueryCollection context is the parent dir for all query collections.

In the Migrator, this directory is not used as it isn't relevant. It doesn't use query collections (although it is built-up by the db-migrate script for some reason?).

What is relevant for Migrator is the migration path - but that is passed-in as a separate variable to the Migrator constructor.

So... should the baseDirectory in the Settings object actually be set to the migration path for Migrator, removing the need for a separate migrationPath? Or should there be two separate Settings subclasses - one that's relevant for QueryCollections and one that's relevant for Migrator?

The question is born from the possibility (or fact in my case) of the migrations being stored outside the query directory, rather than as a subdir of it.

Templating queries to avoid view inefficiencies

Allow queries to import shared SQL templates for easier and more efficient reuse than using views.

select_customer.sql

select
  first_name,
  last_name,
  gender,
  id_customer,
  customer_type.name as name_customer_type

from customer

inner join customer_type
using(id_customer_type)

get_male_customers.sql

{select_customer}
where
  gender = "m"

Your thoughts, @j4m3s?

Calling a method on a ResultSet "breaks" it

Calling the same method on a ResultSet gives a different result each time. For example below, I call count() twice, and the second time the answer is different to the first. The same happens with jsonSerialize() and presumably other methods too.

$db = new Client($dbSettings);

$clientRecord = $db->queryCollection("client")->query("GetClient", [
    "ID" => "client1"
]);

var_dump(count($clientRecord->count()));
var_dump($clientRecord->count());

Yields:

int 1
int 0

Variable arguments are too deeply wrapped

Looks like a misunderstanding of the new ... operator.

When a query is invoked using a new shorthand method from #51, the following occurs:

an array of parameters are passed in as the second "variable argument" parameter.
this creates an array of the array, which is then passed to the common "query" method.
query also uses varargs, so now we have an array within an array within an array.

The query method was expecting this and setting $bindings = $bindings[0] if it was an array, but due to the new helper methods there is a chance of getting the bindings doubly-wrapped.

Port number is ignored

It appears that the port number is not being used. The Settings::getConnectionSettings() method does not return port in its array, and I can't find any calls to Settings::getPort() (other than Migrator which clones the Settings object).

This has come to light trying to debug a docker issue whereby the Migrator was attempting to migrate the local db (on 3306) rather than the docker one running on a different port.

Run styleguide fixer with composer fix

Add styleguide as a dev dependency.

Ask if it's OK that a migration doesn't match

When a table is migrated, the md5 of the sql is stored.

If the md5 doesn't match the migration next time, an exception will be thrown, but maybe it's OK in some situations?

For example, if a developer goes back to old migrations and alters a comment, as long as he can check that in version control, he will be able to confirm that there is no problem accepting this migration even though it has changed.

ResultSet count() doesn't reset iterator index

If you count a ResultSet, you can no longer foreach over it, because the index is set at the end of the internal array.

How to register custom fetch callbacks?

$customer = $db->fetch("customer/getById", 123);

The $customer object can only ever be an instance of Row, as that's what fetch returns.

Where can the developer specify their own fetch functions?

Base Db class is instantiated on the Page Logic.
TypeHinted to \Gt\Database\Database | \App\Database\Database
Optionally extend base Database class in src/query/db.php
Extended Database defines fetchAsCustomer():MyApp\Model\Customer

Although with the base namespace being dynamic I think this is too far fetched.

This enhances #72

Any tests for the db-migrate script?

was just poking around and it looks like there could be some logic errors in the autoload section of the script. For example I could specify an autoload path but it would be modified (at least prefixed with ../) if I haven't specified the repoBasePath?

(and it's a single 153 line function - uncle Bob wouldn't approve :) )

Remove Illuminate requirement until needed (if ever)

Illuminate's database is currently required to provide schema/query builder. I'm not even sure if this is something people want, so the requirement should be removed until at least v3 and #26 is addressed.

PHP QueryBuilder and SchemaBuilder migrations

This repository has been purposefully built on top of Illuminate's database repository so it can take advantage of the Query and Schema Builder tools.

Given a project that currently interfaces with the database like so:

$customer = $db["customer"]->getById(123)

the query/customer/getById.sql SQL script should be able to be replaced with a query/customer/getById.php PHP class without having to change the project implementation.

SQL and PHP Query Builder files should be interchangeable.

Better shorthand methods

Using magic methods breaks a lot of IDEs and is not great API design. The get* getAll* insert* ideas from #20 are being dropped in favour of this simplified API:

$customer = $db->fetch("customer/getById", 123);
echo "Customer name is " . $customer->first_name;

$numberDeleted = $db->delete("customer/deleteOldCustomers", ["created_at" => new DateTime("-2 years")]);
echo "Deleted $numberDeleted customers!";

Shorthand named QueryCollections.

It's possible to use the $db["qcName"] syntax to get reference to the QueryCollection in the default connection, but what about other connections?

$db["connectionName.qcName"] could work?

Pass a list of parameters for when using `?` placeholders

Certain queries are better to use ? placeholders rather than :named placeholders, typically queries where the placeholder can be inferred by the query name.

For example, the getById query should only take the one parameter, which doesn't need to be named "id" when calling.

Clean, improved syntax:

$customer = $db["customer"]->getById(123);

Rather than:

$customer = $db["customer"]->getById(["id" => 123]);

Should DatabaseClient be renamed to Client?

The Readme refers to Client, but the class is actually called DatabaseClient. My first thought was that DatabaseClient is better than just Client, but having had a look it appears that several other major libraries just use Client. For example, Guzzle, Symfony, Behat, phpleague's OAuth libraries.

Now we have namespaces in PHP, maybe it's ok to stick with just "Client"?

Automatic type casting on Row classes

We need to implement a type casting system so that Row property types are casted automatically as they are fetched from the database.

When the typed properties RFC is passed this will be baked into the language, but we can introduce docblock casting already.

There needs to be a way of "registering" classes (or namespaces of classes) for the QueryCollection by name. For example, $db->fetch("customer/getById", 123); will return a Row, but if there is a registered class for "Customer" that extends Row it will return that instead.

Pre property type hints:

class Customer extends Row {
	/** @var int *//
	public $id;
	/** @var DateTime **/
	public $registered_at;
}

Post property type hints:

class Customer extends Row {
	public int $id;
	public DateTime $registered_at;
}

~~For this functionality to work, the properties must be declared within the class - not as docblock comments to the class. This is for future compatibility when we have typed properties.~~

Update: This functionality can work with DocBlock comments, until typed properties become available! The Reflection API can look at the doc blocks for properties and cast them there.

Integration tests.

The library needs to have tests using it from end to end, rather than just in unit isolation.

I'm not sure of the best approach for this. PHPUnit is probably not the right tool here (?)

The integration tests need to act as documentation, answering the question raised in #11 .

Handle multiple data access styles.

Inline SQL - Illuminate's database works like this.
PHP.Gt style - with SQL in separate files - map to Illuminate's database "capsule".
Laravel style - object-oriented query builder.
PHP.Gt/Laravel hybrid - Instead of Customer/getAll being mapped to Data/Customer/getAll.sql, map to Data/Customer.php::getAll() and use Illuminate to handle and return data.

Keep in mind how Illuminate's object builder is chainable - it will be very useful to be able to add filters and sorts to PHP.Gt's database before the query is made.

Halt migrations if there is not a numerically sequential file list

It is common to have multiple branches from the same point in master that all introduce new migrations. If each branch introduces a new migration, when they are merged into master we will have multiple migrations with the same number.

A check needs to take place before any migrations occur to ensure that all files in the directory are properly numerically sequenced. If there are duplicate numbers or gaps in the sequence then fail - the migrator can't assume the order to run, and the decision must be passed back to the developer.

Could/ should it be possible to specify the database port number?

Rename "database" to schema throughout connection namespace

Separate migrations and view creations?

It is useful to refer back to views in source code, but if they are in amongst the migrations they may be difficult to find.

Also, a future migration is likely to break a view.

Run all view creation scripts after migrations have occurred somehow? @j4m3s what do you think about this?

Using QueryCollection fetch shorthand without params causes error

QueryCollection->query("something");  // works
QueryColelction->something(["binky" => "boo"]); // works
QueryCollection->something();


There was 1 error:

1) Gt\Database\Query\QueryCollectionTest::testQueryShorthand
Undefined offset: 0

Hungarian notation

There are a few cases throughout the code that use Hungarian Notation.

An interface doesn't need to be called SomethingInterface, because the only way it can be used is as an interface. class MyThing implements Something is not any less obvious in intent than class MyThing implements SomethingInterface, and while we're at adding "Interface" to the end of interfaces, why not go the full hog and use class MyThingClass implements SomethingInterface?

While we're at it, the same can be said about exceptions. If your code is catching exceptions, their names don't need to state that they are SomethingExceptions. It's all obvious in context, and even if not, the IDE's job is to tell you.

Thinking about it, there is a case for naming something as an exception when it is a base class, such as the DatabaseException of which all exceptions inherit within this repository.

Persistent connections?

Read through the pros/cons here: http://stackoverflow.com/questions/23432948/fully-understanding-pdo-attr-persistent

Migrations using timestamps

Opportunity for another gt command: migrate.

Migration filenames should use a sequence that can't clash when multiple developers are working concurrently (don't use incremental numbers, for example).

Use the unix timestamp, suffixed with either a description of the migration or a random hash.

IMPORTANT - don't trust the number in the sequence that has been migrated to - two or more developers could have checked out from the same point of migration. Developer A runs their migration of 10 scripts, then developer B's migration number will be less than the current state of the migration.

The migration table must make a record for every script that has been run, by name. The migration script must check each and every script from first to last, running missing scripts in order.

How should we handle if developer B's migrations have side effects to developer A's, or if the side effects of each developer's migrations cause the other's to fail? Manual intervention involving a future fix migration?

Retrieve lastInsertID

How do you retrieve the last commit ID after an insert?

Migrations.

Automatically handle sequential migrations.

Per database, create a gt_migration table (configurable) and store a row for the currently-migrated sequence number.
Before a migration occurs, snapshot the database.
Attempt to apply each migration in sequence.
Rollback to the snapshot if any error occurs.
Update the migration table row with the new sequence number on success.

Anything else to consider?

Better exception models

Rather than throwing the uncaught PDOException, make a better model which relates to the actual problem in the SQL.

We already have QueryCollectionNotFoundException and QueryNotFoundException, when dealing with the actual filesystem, but it would be better to have the PDOExceptions wrapped in a more readable manner.

Prefix is pointless

The SettingsInterface interface does not need to mention or deal with prefixes any more, since the dependency of Illuminate's database is removed.

Drop ArrayAccess

After doing some market research, the ArrayAccess ability of Query Collections is not necessary and will only cause confusion to newcomers.

The same decision has been made with Dom - don't allow a secondary "shorthand" method to access queries - and all PhpGt repositories should match the same coding style.

This should be dropped for v1 as to not cause any backwards breaking changes.

Binding non-column values in prepared statements

The way PDO protects against SQL injection is by asking the database driver for the expected variable type of placeholders. This is done by checking the schema against where the placeholders exist.

When PDO sees a placeholder that isn't bound to a column of specific type, for example: limit :limit, it either chooses to bind NULL or 0, as it doesn't know what to do with the provided value.

Version 2 solved this by checking for predefined strings, such as :limit, and using str_replace on those parameters. But is there a better solution?

Is it possible to set options for PDO?

I frequently re-use named parameter markers to optimise or avoid the need for subqueries. According to the PHP manual:

You cannot use a named parameter marker of the same name more than once in a prepared statement, unless emulation mode is on

The necessary parameter is PDO::ATTR_EMULATE_PREPARES => true

By default it is turned-off in Illuminate database, meaning re-use of parameter markers throws an SQLSTATE[HY093]: Invalid parameter number error.

This does highlight the need to allow configuration of PDO parameters via PhpGt/Database, and possibly also to set this particular parameter to "on" by default.

To configure database parameters in Illuminate database, you add an options key to the associative array of parameters used in Manager->addConnection(), as follows:

use Illuminate\Database\Capsule\Manager as Capsule;

$capsule = new Capsule;
$capsule->addConnection([
    'driver' => 'mysql',
    'host' => $db_host,
    'database' => $db_name,
    'username' => $db_username,
    'password' => $db_password,
    'charset' => 'utf8',
    'collation' => 'utf8_unicode_ci',
    'prefix' => '',
    'options' => [
        PDO::ATTR_EMULATE_PREPARES => true,
    ]
]);