What do Clustered and Non clustered index actually mean?

[Origin]: https://stackoverflow.com/questions/1251636/what-do-clustered-and-non-clustered-index-actually-mean

I have a limited exposure to DB and have only used DB as an application programmer. I want to know about Clustered and Non clustered indexes. I googled and what I found was :

A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages. A nonclustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.

What I found in SO was What are the differences between a clustered and a non-clustered index?.

Can someone explain this in plain English?

With a clustered index the rows are stored physically on the disk in the same order as the index. Therefore, there can be only one clustered index.

With a non clustered index there is a second list that has pointers to the physical rows. You can have many non clustered indexes, although each new index will increase the time it takes to write new records.

It is generally faster to read from a clustered index if you want to get back all the columns. You do not have to go first to the index and then to the table.

Writing to a table with a clustered index can be slower, if there is a need to rearrange the data.

In SQL Server row oriented storage both clustered and nonclustered indexes are organized as B trees.

enter image description here

(Image Source)

The key difference between clustered indexes and non clustered indexes is that the leaf level of the clustered index is the table. This has two implications.

  1. The rows on the clustered index leaf pages always contains something for each of the (non sparse) columns in the table (either the value, or a pointer to the actual value).
  2. The clustered index is the primary copy of a table.

Non clustered indexes can also do point 1 by using the INCLUDE clause (Since SQL Server 2005) to explicitly include all non key columns but they are secondary representations and there is always another copy of the data around (the table itself).

CREATE TABLE T
(
A INT,
B INT,
C INT,
D INT
)

CREATE UNIQUE CLUSTERED INDEX ci ON T(A,B)
CREATE UNIQUE NONCLUSTERED INDEX nci ON T(A,B) INCLUDE (C,D)

The two indexes above will be nearly identical. With the upper level index pages containing values for the key columns A,B and the leaf level pages containing A,B,C,D

There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.

The above quote from SQL Server books online causes much confusion

In my opinion it would be much better phrased as.

There can be only one clustered index per table, because the leaf level rows of the clustered index are the table rows.

The books online quote is not incorrect but you should be clear that the “sorting” of both non clustered and clustered indices is logical not physical. If you read the pages at leaf level by following the linked list and read the rows on the page in slot array order then you will read the index rows in sorted order but physically the pages may not be sorted. The commonly held belief that with a clustered index the rows are always stored physically on the disk in the same order as the index keyis false.

This would be an absurd implementation. For example if a row is inserted into the middle of a 4GB table SQL Server does not have to copy 2GB of data up in the file to make room for the newly inserted row .

Instead a page split occurs. Each page at the leaf level of both clustered and non clustered indexes has the address (File:Page) of the next and previous page in logical key order. These pages need not be either contiguous or in key order.

e.g. the linked page chain might be 1:2000 <-> 1:157 <-> 1:7053

When a page split happens a new page is allocated from anywhere in the filegroup (from either a mixed extent, for small tables, or a non empty uniform extent belonging to that object or a newly allocated uniform extent). This might not even be in the same file if the file group contains more than one.

The degree to which the logical order and contiguity differs from the idealised physical version is the degree of logical fragmentation.

In a newly created database with a single file I ran the following.

CREATE TABLE T
  (
     X TINYINT NOT NULL,
     Y CHAR(3000) NULL
  );

CREATE CLUSTERED INDEX ix
  ON T(X);

GO

--Insert 100 rows with values 1 - 100 in random order
DECLARE @C1 AS CURSOR,
        @X  AS INT

SET @C1 = CURSOR FAST_FORWARD
FOR SELECT number
    FROM   master..spt_values
    WHERE  type = 'P'
           AND number BETWEEN 1 AND 100
    ORDER  BY CRYPT_GEN_RANDOM(4)

OPEN @C1;

FETCH NEXT FROM @C1 INTO @X;

WHILE @@FETCH_STATUS = 0
  BEGIN
      INSERT INTO T (X)
      VALUES        (@X);

      FETCH NEXT FROM @C1 INTO @X;
  END

Then checked the page layout with

SELECT page_id,
       X,
       geometry::Point(page_id, X, 0).STBuffer(1)
FROM   T
       CROSS APPLY sys.fn_PhysLocCracker( %% physloc %% )
ORDER  BY page_id

Results were all over the place. The first row in key order (with value 1 – highlighted with arrow below) was on nearly the last physical page.

enter image description here

Fragmentation can be reduced or removed by rebuilding or reorganising an index to increase the correlation between logical order and physical order.

After running

ALTER INDEX ix ON T REBUILD;

I got the following

enter image description here

If the table has no clustered index it is called a heap.

Non clustered indexes can be built on either a heap or a clustered index. They always contain a row locator back to the base table. In the case of a heap this is a physical row identifier (rid) and consists of three components (File:Page:Slot). In the case of a Clustered index the row locator is logical (the clustered index key).

For the latter case if the non clustered index already naturally includes the CI key column(s) either as NCI key columns or INCLUDE-d columns then nothing is added. Otherwise the missing CI key column(s) silently get added in to the NCI.

SQL Server always ensures that the key columns are unique for both types of index. The mechanism in which this is enforced for indexes not declared as unique differs between the two index types however.

Clustered indexes get a uniquifier added for any rows with key values that duplicate an existing row. This is just an ascending integer.

For non clustered indexes not declared as unique SQL Server silently adds the row locator in to the non clustered index key. This applies to all rows, not just those that are actually duplicates.

The clustered vs non clustered nomenclature is also used for column store indexes. The paper Enhancements to SQL Server Column Stores states

Although column store data is not really “clustered” on any key, we decided to retain the traditional SQL Server convention of referring to the primary index as a clustered index.

Advertisements

Emacs – Error when calling (server-start)

[Origin]: https://stackoverflow.com/questions/885793/emacs-error-when-calling-server-start

I am currently using GNU Emacs 23.0.93.1 in Windows Vista SP1. In my .emacs file I make a call to (server-start) and that is causing an error with the message The directory ~/.emacs.d/server is unsafe. Has anyone seen this and know a fix or workaround? … other than leaving server turned off 😉

Here is the stack trace:

Debugger entered--Lisp error: (error "The directory ~/.emacs.d/server is unsafe")
  signal(error ("The directory ~/.emacs.d/server is unsafe"))
  error("The directory %s is unsafe" "~/.emacs.d/server")
  server-ensure-safe-dir("~\\.emacs.d\\server\\")
  server-start(nil)
  call-interactively(server-start t nil)
  execute-extended-command(nil)
  call-interactively(execute-extended-command nil nil)

I found this solution on EmacsWiki:

“The problem is the ownership of the directory ~/.emacs.d/server when you also have “Administrators” rights on your account. Create the directory ~/.emacs.d/server and set the owner of this directory to your login name and the problem is gone. As I have a “Dutch” version of Windows 7 I don’t know the English terms exactly but here’s the procedure:

Click R-mouse on ~/.emacs.d/server and select “Properties” (last item in menu). From Properties select the Tab “Security” and then select the button “Advanced”. Then select the Tab “Owner” and change the owner from Administrators (<your-pc-name>\Administrators) into <your-login-name> (<your-pc-name>\<your-login-name>. Now the server code will accept this directory as secure because you are the owner.

Hope this helps for all you guys, it solved the problem for me anyway.

W.K.R. Reutefleut”

It definitely works on Vista, with Emacs 23.2.1.

Lock, mutex, semaphore… what’s the difference?

[Origin]: https://stackoverflow.com/questions/2332765/lock-mutex-semaphore-whats-the-difference

A lock allows only one thread to enter the part that’s locked and the lock is not shared with any other processes.

A mutex is the same as a lock but it can be system wide (shared by multiple processes).

A semaphore does the same as a mutex but allows x number of threads to enter, this can be used for example to limit the number of cpu, io or ram intensive tasks running at the same time.

You also have read/write locks that allows either unlimited number of readers or 1 writer at any given time.

There are a lot of misconceptions regarding these words.This is from a previous post (https://stackoverflow.com/a/24582076/3163691) which fits superb here:

1) Critical Section= User object used for allowing the execution of just one active thread from many others within one process. The other non selected threads are put to sleep.

[No interprocess capability, very primitive object].

2) Mutex Semaphore (aka Mutex)= Kernel object used for allowing the execution of just one active thread from many others, among different processes. The other non selected threads are put to sleep. This object supports thread ownership, thread termination notification, recursion (multiple ‘acquire’ calls from same thread) and ‘priority inversion avoidance’.

[Interprocess capability, very safe to use, a kind of ‘high level’ synchronization object].

3) Counting Semaphore (aka Semaphore)= Kernel object used for allowing the execution of a group of active threads from many others. The other non selected threads are put to sleep.

[Interprocess capability however not very safe to use because it lacks following ‘mutex’ attributes: thread termination notification, recursion?, ‘priority inversion avoidance’?, etc].

4) And now, talking about ‘spinlocks’, first some definitions:

Critical Region= A region of memory shared by 2 or more processes.

Lock= A variable whose value allows or denies the entrance to a ‘critical region’. (It could be implemented as a simple ‘boolean flag’).

Busy waiting= Continuosly testing of a variable until some value appears.

Finally:

Spin-lock (aka Spinlock)= A lock which uses busy waiting. (The acquiring of the lock is made by xchg or similar atomic operations).

[No thread sleeping, mostly used at kernel level only. Ineffcient for User level code].

As a last comment, I am not sure but I can bet you some big bucks that the above first 3 synchronizing objects (#1, #2 and #3) make use of this simple beast (#4) as part of their implementation.

Have a good day!.

References:

-Real-Time Concepts for Embedded Systems by Qing Li with Caroline Yao (CMP Books).

-Modern Operating Systems (3rd) by Andrew Tanenbaum (Pearson Education International).

-Programming Applications for Microsoft Windows (4th) by Jeffrey Richter (Microsoft Programming Series).

Also, you can take a look at look at: https://stackoverflow.com/a/24586803/3163691

What’s the meaning of require: ‘ngModel’?

[Origin]: https://stackoverflow.com/questions/20930592/whats-the-meaning-of-require-ngmodel

This is the HTML for my directive:

<textarea data-modal="modal" data-mydir ng:model="abc"></textarea>

In my directive I have this:

return {
        require: 'ngModel',
        replace: true,
        scope: {
            modal: '=modal',
            ngModel: '=',
            pid: '=pid'
        },

Can someone tell me, what’s the significance of require: ‘ngModel’ ? I see this in many different directives. Could I call this data-modal?

I am confused because when I change it to data-modal I get a message from Angular saying

Controller 'ngModel', required by directive 'textarea', can't be found!

Delete file containing invalid characters in windows

[Origin]: https://serverfault.com/questions/95581/delete-file-containing-invalid-characters-in-windows

I have some files that contain colon character in the filename (eg. 1d67c0d23e859ed4a259749e4a720d9e:default-sink). When I try to remove them from command line with the command:

c:\backup> del /F *.*

I get “The system cannot find the file specified.” for each file. If I specify an individual file:

c:\backup> del /F "1d67c0d23e859ed4a259749e4a720d9e:default-sink"

I get “The filename, directory name, or volume label syntax is incorrect.”. If I try to use rd on the parent folder I get

How can I remove these files?

This worked for me:

  1. Ran chkdsk /f
  2. Rebooted PC
  3. Then I was able to select and delete the file

Close all the files, if any opened, that are saved in the drive which is containing such files. Now, Open the drive properties.

“My Computer” -> “Right Click on the this drive” > properties

Then, go to “Tools” tab and click “Check now” under ‘Error checking’ label.

enter image description hereClick here for larger image

This will unmount your drive and will scan the complete drive for errors, eventually it will delete such files with illegal names. On finishing it will show a messages saying some files were fixed.

I successfully deleted such files from my external hard disk.

C brace indentation

[Origin]: https://emacs.stackexchange.com/questions/22673/c-brace-indentation

I am new to Emacs and trying to learn the basics.

I am writing a few C programs and I noticed that the default brace indentation is as follows:

for(i = 0; i &lt; 10; ++i)
    {
        // code 
    }

How would I go about modifying this behaviour to achieve the following instead:

for(i = 0; i &lt; 10; ++i)
{
    // code
}

As mentioned in this answer on SO, you can do:

(setq c-default-style "bsd")

This will set the style for all C-based modes. If you want to set it only for one, do e.g.:

(add-to-list 'c-default-style '(c-mode "bsd"))

However, setting c-default-style will change various other style settings, which may not be what you want. To only change the post-for-loop-brace indentation, do:

(add-to-list 'c-offsets-alist '(substatement-open . 0))

(This will shadow the old value, which AFAIK shouldn’t cause any problems. See this question and its answers for possible ways to actually replace the old value.)

Now to generalize: how do you find the specific syntactic symbol in the c-offsets-alist that you need to modify?

Note the line that has faulty indentation (in this case, the one with the { after the for loop), and move the point to that line. Then do:

M-x c-show-syntactic-information

(or C-c C-s). This will give you e.g.:

Syntactic analysis: ((substatement-open 16))

Another option is to do M-x c-set-offset (or C-c C-o), which will give you the following prompt:

Syntactic symbol to change: |substatement-open

Here the | marks the position of the cursor, with the relevant symbol auto-filled for you.

Thanks to @nispio for the info!

Tried to Load Angular More Than Once

[Origin]: https://stackoverflow.com/questions/22595878/tried-to-load-angular-more-than-once

This could be a number of issues: essentially it’s a problem of routeProvider not finding a file and recursively loading the default.

For me, it turned out that it wasn’t minification but concatenation of the js that caused the problems.

angular.module('myApp').config(['$routeProvider', function ($routeProvider) {
    $routeProvider
      .when('/', {
        templateUrl: 'views/listing.html',
        controller: 'ListingCtrl'
      })
      .otherwise({
        redirectTo: '/'
      });
  }]).constant('FIREBASE_URL', 'something');

You’ll notice that if the app can’t find a file (i.e., otherwise), then it will redirect to the root, which in this case loads the templateUrl. But if your templateUrl is wrong, then it will cause a recursion that reloads index.html loading angular (and everything else) over and over.

In my case, grunt-concat caused the templateUrl to be wrong after build, but not before.

I had the same issue, The problem was the conflict between JQuery and Angular. Angular couldn’t set the full JQuery library for itself. As JQLite is enough in most cases, I included Angular first in my web page and then I loaded Jquery. The error was gone then.