Home Our Concept Services Other Products Shopping Cart Payment Options FAQ Privacy About Us Contact Us
 
Sitemap
You are here: Home > Whitepapers > Store documents as file in filesystem or in a database?

Store Images and Documents as File in a Filesystem or in a Database?

- A discussion of several aspects

To store document attributes (title, author, keywords, version(s) ...) in database tables is common practice.

Those attributes are stored in database fields of data types "number" and "character".

But database systems also provide additional data types to store (large) objects in binary and text format.


This article discusses, if the document (text, image, spreadsheet, ...) itself

  • should be stored in a database field,
  • or just referenced by a pointer to a physical file in the file system.

There exists no general answer – it depends on the individual situation:

Table of Pro- and Contra Arguments

Architecture ==>

Topic

Documents stored in Filesystem

Documents stored in Database

Transactional consistence and durability in case of server-crash (without need to restore)

(+) Advanced file system options avoid loss of data

(-) Simple file systems might cause loss of data from write-back cache which have not been written to disk before crash.

(+) Transaction log files / redo log files ensure durability of committed transactions.

Transactional consistence and durability in case of restore from tape

(-) you can only restore files which have been present at time of last backup. New files or file modifications can't be restored.

(+) When running database in transaction log mode / archive log mode (usual for production databases) you never should use documents, even if you need to restore from tape as you can "roll forward" using transaction log /archive log files.

Performance

For a low / medium number of small images (typically icons for a webserver) the Webserver's caching feature for listed directories is optimal.

In case of very large number (more than 100,000) of files in a single directory, file access can significantly slow down (depends on file system version).

As long as the design principle of session-persistence is followed, performance should be fast. Database Login/Logout per image / document is of course a show stopper.

Space efficiency


In case of very small files (smaller than file system block size) database is more efficient

Backup performance

(-) The backup system records each single file with path / name, file time stamp, tape-id, time of backup.

Even in case of daily-differential and only 1x per week full-backup, still daily millions of files need to be checked if those have been changed since last backup.

Mitigation: "tar" of group of files – but then the file is not direct accessible any more, some program / script code required to identify file in tar-archive and untar

(+) - it is more efficient to back up a few large database files containing millions of documents

Restore performance after loss of data (file system corruption etc.)

(-) slow

(+) fast

Scalability:

Access from several application servers

using shared filesystems, e.g.

  • cluster file system provided by vendor of operating system cluster software

  • NTFS

  • Samba

performance impact and security constraints need to be considered

Database access from several servers is the typical, normal scenario.

compression

utilize file system feature "compress" (where available)

utilize database feature which automatically compresses data

Archiving (e.g. files older than 1 year)

usually on granularity of subdirectory- or filesystem level

utilizing "transportable tablespaces"

(pre-requisite: partitions of database tables stored in separated tablespaces)

When using tape as long term archive:

Restore from tape of a single file

(+) possible

(-) not possible – complete database or subset of database (e.g. tablespace) needs to be restored.

Costs – Licenses

Licenses for advanced file system options or cluster file systems

Database Licenses

Costs – Backup Costs

(-) higher

(+) lower

Copyright © 2005-2010 Mercury Consulting Limited.