Thursday, December 13, 2007

Hide Console Windows in Perl

Here's a nice snippet if you want to hide the console windows that open up when calling utilities or commands using backticks. Perl, being an excellent duct-tape language, makes this easy: just add this snippet at the bottom of your code:

        BEGIN {
            Win32::SetChildShowWindow(0)
                if defined &Win32::SetChildShowWindow;
        }

Sunday, October 14, 2007

Tip on using the FileSystemWatcher class: make use of InternalBufferSize

There are a few gotchas in implementing FileSystemWatcher. One that I have recently encountered involves the InternalBufferSize property. Previously, I built a Windows service that watches for new files created in a directory. The files are then processed and the data uploaded to a SQL Server database. During my testing and initial deployment, I didn't see any issue, nor did I think of any potential problems. Until one day I noticed that it was not picking up all the files; some files were being left out at random. After some investigation, I found out that this was happening because of an overflow in the buffer that backs the FileSystemWatcher. The default for this property is only 8 KB, and it has to hold the events and file information for every change it sees. According to rough estimates, 8 KB can only hold information for about 80 files at a time. So my solution was to increase the buffer size to 32 KB, which is 32768 bytes. As noted on MSDN, the setting should be a multiple of 4 KB.
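
The rough arithmetic behind that "about 80 files" figure can be sketched in a few lines. The per-event header size and average file-name length below are my own illustrative assumptions, not the documented buffer layout:

```python
def events_per_buffer(buffer_bytes, avg_name_chars=42):
    """Roughly how many change events fit in the watcher's internal buffer.

    Assumes ~16 bytes of fixed header per event plus 2 bytes per file-name
    character (UTF-16) -- illustrative numbers, not the exact layout.
    """
    return buffer_bytes // (16 + 2 * avg_name_chars)

print(events_per_buffer(8192))   # default 8 KB buffer -> about 80 events
print(events_per_buffer(32768))  # 32 KB buffer -> about four times as many
```

The point is simply that a burst of file creations larger than this easily overflows the default buffer, and overflowed events are silently dropped.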

Here's how I implemented the class:

public partial class NXPFileWatcher : ServiceBase
{
    FileSystemWatcher FSWatcher = null;

    public NXPFileWatcher()
    {
        InitializeComponent();
    }

    protected override void OnStart(string[] args)
    {
        try
        {
            FSWatcher = new FileSystemWatcher();
            FSWatcher.Path = ConfigurationManager.AppSettings["WatchPath"];
            FSWatcher.InternalBufferSize = 32768; // 32 KB, up from the 8 KB default
            FSWatcher.NotifyFilter = NotifyFilters.CreationTime;
            FSWatcher.Filter = "*.xml";
            FSWatcher.Created += new FileSystemEventHandler(FSWatcher_Created);
            FSWatcher.EnableRaisingEvents = true;
            SendNotification("has Started");
        }
        catch (Exception ex)
        {
            // log the exception
        }
    }

    protected override void OnStop()
    {
        try
        {
            FSWatcher.EnableRaisingEvents = false;
            SendNotification("has Stopped");
        }
        catch (Exception ex)
        {
            // log the exception
        }
    }

    protected override void OnPause()
    {
    }

    protected override void OnContinue()
    {
    }

    protected override void OnShutdown()
    {
        try
        {
            FSWatcher.EnableRaisingEvents = false;
            SendNotification("has stopped due to system shutting down");
        }
        catch (Exception ex)
        {
            // log the exception
        }
    }

    private void FSWatcher_Created(object sender, FileSystemEventArgs e)
    {
        // Now open the file
        XmlFile xml = new XmlFile();
        xml.FullPath = e.FullPath;
        // ... process the file and upload the data
    }
}

Friday, September 28, 2007

List of Access Modifiers in C#

I have been programming in C# for more than three years now, yet I always forget the specifics of keywords like access modifiers. The most commonly used ones are of course 'public' and 'private', and it's pretty obvious what they mean. But "protected internal", for example, I just cannot recall off the top of my head what it covers, so just as a note, here they all are:

Access Modifier      Meaning
public               Available to all classes
protected            Available only to the current class or to derived classes
internal             Available only to classes in the current assembly
protected internal   Available to the current class, derived classes, or any class in the current assembly
private              Available only to the containing class

(Strictly speaking, 'virtual' is not an access modifier; it marks a method or property that can be overridden by a subclass.)

Monday, September 24, 2007

My Son's Playing in His First Piano Recital

He had been practicing at least once daily for the past few weeks to prepare for his first recital ever. His perseverance and hard work surely paid off. In his short program, he played two pieces back to back, Baby Face and Swing Town. My goodness, not tense, not flustered, he just played them, beautifully. As for me, it was nerve-wracking, but I made it. Just ecstatic! I'm proud of you, son.

Friday, September 14, 2007

Hey There Delilah

Another song I like, by Plain White T's. There are songs that just strike your heartstrings; this is one of them. This song reminded me of when my then-girlfriend had to leave for another country. I liked the song, first, because of the lyrics, hmmm... a bit similar to the words I said in my letters and phone calls to her. Hey, this guy is simply retelling a segment of my life. Second, the melody of the song is just great, hauntingly folkish. Btw, a little bit of trivia about this song: it was actually released in 2005. Now that the band has reached superstar (or superband?) status, some of their songs have been revisited, this one among them. It got re-released and reached Billboard's top spot in July 2007, and it's now even considered the biggest hit of the summer. Read more from here.

Hey there Delilah,
What's it like in New York City?
I'm a thousand miles away,
But girl tonight you look so pretty,
Yes you do,
Times Square can't shine as bright as you,
I swear it's true.

Hey there Delilah,
Don't you worry about the distance,
I'm right there if you get lonely,
Give this song another listen,
Close your eyes,
Listen to my voice it's my disguise,
I'm by your side.

Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me,
What you do to me.

Hey there Delilah,
I know times are getting hard,
But just believe me girl some day,
I'll pay the bills with this guitar,
We'll have it good,
We'll have the life we knew we would,
My word is good.

Hey there Delilah,
I've got so much left to say,
If every simple song I wrote to you,
Would take your breath away,
I'd write it all,
Even more in love with me you'd fall,
We'd have it all.

Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me.

A thousand miles seems pretty far,
But they've got planes and trains and cars,
I'd walk to you if I had no other way,
Our friends would all make fun of us,
And we'll just laugh along because,
We know that none of them have felt this way,
Delilah I can promise you,
That by the time that we get through,
The world will never ever be the same,
And you're to blame.

Hey there Delilah you be good,
And don't you miss me,
Two more years and you'll be done with school,
And I'll be making history,
Like I do,
You'll know it's all because of you,
We can do whatever we want to,
Hey there Delilah here's to you,
This one's for you.

Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me,
Oh it's what you do to me,
What you do to me.


How to search all stored procedures in a db for a particular "text"

After migrating a large db from SQL Server 2000 to SQL Server 2005, one of the changes I needed to make was to switch from xp_smtp_sendmail to sp_send_dbmail, using the new "Database Mail" feature. Since I'm dealing with a db with at least a couple hundred stored procedures, I didn't want to look at each one manually. Here is a blog post I found tackling this issue.

Summarizing, here's what I used (note that routine_definition only holds the first 4000 characters of each procedure, so a match buried deep in a very long procedure could be missed):

use mydb;

select routine_name, routine_definition
from information_schema.routines
where routine_definition like '%xp_smtp_sendmail%'
and routine_type = 'procedure'
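
The filter itself is just a substring test, which is easy to prototype outside the database. Here's a hypothetical Python sketch (the procedure names and bodies are made up) showing what the LIKE '%xp_smtp_sendmail%' predicate is doing:

```python
# Hypothetical routine definitions, standing in for INFORMATION_SCHEMA.ROUTINES rows.
routines = {
    "usp_send_alert": "exec master.dbo.xp_smtp_sendmail @recipients = @to",
    "usp_daily_load": "insert into staging select * from source_table",
}

# Equivalent of: WHERE routine_definition LIKE '%xp_smtp_sendmail%'
hits = sorted(name for name, body in routines.items()
              if "xp_smtp_sendmail" in body.lower())
print(hits)  # ['usp_send_alert']
```
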

Tuesday, September 04, 2007

SSIS problems with Excel Source

I'm working on an import into my databases from an Excel sheet. This should have been a breeze with SSIS if not for issues with the Excel Source data flow component, which are apparently caused by the Jet OLE DB provider. Let me describe the issue. When data is read from an Excel worksheet, Jet samples the first few rows to guess each column's data type. The problem is that when it later encounters a value of a different type than it assumed, instead of raising an exception, it simply drops the value and replaces it with NULL. So at first I thought all the data had been imported into the database, not knowing that some of it was NULL. Bad...bad...bad.
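
To see why this silent substitution is so dangerous, here's a small Python sketch of the sampling behavior described above. The sample size and coercion rules are simplified assumptions for illustration, not Jet's actual algorithm:

```python
def jet_style_coerce(values, sample_rows=8):
    """Illustrative sketch (not Jet itself): guess a column type from the
    first `sample_rows` values, then silently NULL out anything that
    doesn't match -- the failure mode described above."""
    sample = values[:sample_rows]
    guessed_numeric = all(isinstance(v, (int, float)) for v in sample)
    if guessed_numeric:
        # Values that aren't numeric quietly become None (NULL), no error raised.
        return [v if isinstance(v, (int, float)) else None for v in values]
    return [str(v) for v in values]  # a text column accepts everything

# A numeric-looking column with a stray string past the sampled rows:
column = [1, 2, 3, 4, 5, 6, 7, 8, "N/A", 10]
print(jet_style_coerce(column))  # the "N/A" silently becomes None
```

This is exactly what IMEX=1 plus the TypeGuessRows registry change below works around: sample more rows (or all of them), and treat mixed columns as text.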

Anyhow, here's the solution, enjoy:

1. In the Excel Connection Manager, I changed the connection string to: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=H:\BillingReport\Upload.xls;Extended Properties="Excel 8.0;HDR=YES;IMEX=1";

I added IMEX=1, which tells the driver to treat intermixed columns as text.

2. On the machine where the package will run, I changed the registry value HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0 (zero makes it scan all rows, up to a maximum of about 16,384, instead of just the first 8).

This solution came from these posts:


Thank you all!

Thursday, August 16, 2007

SQL Server 2005 Error: User or Role already exists in the current database

Just for me to remember. I'm currently migrating our applications and databases from the East Coast to California. This is something that came up when I migrated one of my databases, and I certainly would like to remember it...

This is what happens when you restore a database to a new server: you expect the existing accounts can be configured to access the database via "User Mapping", but instead you get this error message: "Error 15023: User or role already exists in the current database." Apparently, this is just a case of a SID mismatch between the server login and the database user. Here's the solution:

SQL Server has a built-in stored procedure to handle this issue: sp_change_users_login.

Use the parameter 'Update_One' to remap one user's SID:

USE YourDatabase

EXEC sp_change_users_login 'Update_One', 'yourLogin', 'yourLogin'

Go to this post for further reading.

Monday, June 18, 2007

Nothing Left to Lose

I just love this song by Mat Kearney; I have loved it since the very first time I heard it on the radio. Here are the lyrics, courtesy of splatr:

Something’s in the air tonight
The sky’s alive with a burning light
You can mark my words something’s about to break

And I found myself in a bitter fight
While I've held your hand through the darkest night
Don't know where you're coming from but you're coming soon

To a kid from Oregon by way of California
All of this is more than I’ve ever known or seen

Come on and we’ll sing, like we were free
Push the pedal down watch the world around fly by us
Come on and we’ll try, one last time
I’m off of the floor one more time to find you

And here we go there’s nothing left to choose
And here we go there’s nothing left to lose

So I packed my car and headed east
Where I felt your fire and a sweet release
There’s a fire in these hills that’s coming down

And I don’t know much but i found you here
And I can not wait another year
Don’t know where your coming from but you coming soon

To a kid from Oregon by way of California
All of this is more than i’ve ever known or seen

Come on and we’ll sing, like we were free
Push the pedal down watch the world around fly by us
Come on and we’ll try, one last time
I’m off of the floor one more time to find you

And here we go there’s nothing left to choose
And here we go there’s nothing left to lose

I can still hear the trains out my window
From Hobart street to here in Nashville
I can still smell the pomegranates grow
And I don’t know how hard this wind will blow
Or where we’ll go

Come on and we’ll sing, like we were free
Push the pedal down watch the world around fly by us
Come on and we’ll try, one last time
I’m off of the floor one more time to find you

And here we go there’s nothing left to choose
And here we go there’s nothing left to lose

Tuesday, June 05, 2007

Object Oriented Analysis and Design

I'm reading this book about Object-Oriented Analysis and Design. These are some excerpts I would like to take note of:

The Role of Decomposition

"The technique of mastering complexity has been known since ancient times: divide et impera (divide and rule)". When designing a complex software system, it is essential to decompose it into smaller and smaller parts, each of which we may then refine independently. In this manner, we satisfy the very real constraint that exists on the channel capacity of human cognition: To understand any given level of a system, we need only comprehend a few parts (rather than all parts) at once. Indeed, as Parnas observes, intelligent decomposition directly addresses the inherent complexity of software by forcing a division of a system's state space.

The Role of Abstraction

Earlier, we referred to Miller's experiments, from which he concluded that an individual can comprehend only about seven, plus or minus two, chunks of information at one time. This number appears to be independent of information content. As Miller himself observes, "The span of absolute judgment and the span of immediate memory impose severe limitations on the amount of information that we are able to receive, process and remember. By organizing the stimulus input simultaneously into several dimensions and successively into a sequence of chunks, we manage to break ... this informational bottleneck". In contemporary terms, we call this process chunking or abstraction.

As Wulf describes it, "We (humans) have developed an exceptionally powerful technique for dealing with complexity. We abstract from it. Unable to master the entirety of a complex object, we choose to ignore its inessential details, dealing instead with the generalized, idealized model of the object". For example, when studying how photosynthesis works in a plant, we can focus on the chemical reactions in certain cells in a leaf and ignore all other parts, such as the roots and stems. We are still constrained by the number of things that we can comprehend at one time, but through abstraction, we use chunks of information with increasingly greater semantic content. This is especially true if we take an object-oriented view of the world because objects, as abstractions of entities in the real world, represent a particularly dense and cohesive clustering of information.

Elements of the Object Model

There are four major elements of the object model:

1. Abstraction
2. Encapsulation
3. Modularity
4. Hierarchy

By major, we mean that a model without any one of these elements is not object-oriented.

There are three minor elements of the object model:

1. Typing
2. Concurrency
3. Persistence

By minor, we mean that each of these elements is a useful, but not essential, part of the object model.


Abstraction focuses on the essential characteristics of some object, relative to the perspective of the viewer.


Encapsulation hides the details of the implementation of an object.

Abstraction and encapsulation are complementary concepts: Abstraction focuses on the observable behavior of an object, whereas encapsulation focuses on the implementation that gives rise to this behavior. Encapsulation is most often achieved through information hiding (not just data hiding), which is the process of hiding all the secrets of an object that do not contribute to its essential characteristics; typically, the structure of an object is hidden, as well as the implementation of its methods. "No part of a complex system should depend on the internal details of any other part". Whereas abstraction "helps people to think about what they are doing," encapsulation "allows program changes to be reliably made with limited effort".
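
A minimal Python sketch of the idea (my own illustration, not from the book): callers see only the push/pop abstraction, while the list that implements it stays an encapsulated detail.

```python
class Stack:
    """Abstraction: last-in, first-out push/pop behavior."""

    def __init__(self):
        self._items = []  # encapsulated detail: callers shouldn't depend on this

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

s = Stack()
s.push("a")
s.push("b")
print(s.pop())  # the caller never touches the underlying list
```

Because no part of the program depends on the internal list, the implementation could be swapped (say, for a linked structure) without changing any caller, which is exactly the "program changes reliably made with limited effort" point above.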


Modularity packages abstractions into discrete units.


Abstractions form a hierarchy.


Strong typing prevents mixing of abstractions.


Concurrency allows different objects to act at the same time.


Persistence saves the state and class of an object across time or space.

Monday, May 07, 2007

Update Active Directory using Perl

Over the weekend, I received a phone call from my manager because there was an emergency: help was needed with a site migration happening somewhere in Texas. One of the things that needed to be done was to import data from an Excel spreadsheet into a new Active Directory setup. Well, this is a pretty easy thing to do with Perl.

Here's my solution:
#Program to import data to Active Directory
#by: Sam Dela Cruz
#date: 05/04/2007

use Win32;
use Win32::OLE qw(in);
use Win32::OLE::Enum;
use Spreadsheet::ParseExcel::Simple;

#list should be in the form of:
#usb,lastname,firstname,middlename,org,location,street,city,state,zip,phone,mobile,fax

my $data_source;
unless (@ARGV) {
    print "Enter input file: ";
    chomp ($data_source = <STDIN>);
} else {
    $data_source = shift @ARGV;
}

my $xls = Spreadsheet::ParseExcel::Simple->read($data_source)
    or die "Cannot read $data_source: $!\n";

foreach my $sheet ($xls->sheets()){
    while ($sheet->has_data){
        my ($usb, $ln, $fn, $mn, $org, $location, $street, $city,
            $state, $zip, $phone, $mobile,
            $fax ) = $sheet->next_row;

        my $user;
        if($user = get_user($usb)){

            #Set Info -- attribute names here are the standard AD/LDAP ones
            set_object_property($user, 'sn',                       remove_before_after_space($ln));       #lastname
            set_object_property($user, 'givenName',                remove_before_after_space($fn));       #firstname
            set_object_property($user, 'initials',                 remove_before_after_space($mn));       #middlename
            set_object_property($user, 'company',                  remove_before_after_space($org));      #organization
            set_object_property($user, 'physicalDeliveryOfficeName', remove_before_after_space($location)); #location
            set_object_property($user, 'streetAddress',            remove_before_after_space($street));   #street
            set_object_property($user, 'l',                        remove_before_after_space($city));     #city
            set_object_property($user, 'st',                       remove_before_after_space($state));    #state
            set_object_property($user, 'postalCode',               remove_before_after_space($zip));      #zip
            set_object_property($user, 'telephoneNumber',          remove_before_after_space($phone));    #phone
            set_object_property($user, 'mobile',                   remove_before_after_space($mobile));   #mobile
            set_object_property($user, 'facsimileTelephoneNumber', remove_before_after_space($fax));      #fax

            $user->SetInfo();
            if (Win32::OLE->LastError()){
                print "Cannot upload $usb info: ".Win32::OLE->LastError."\n";
            } else {
                print "Uploaded,$usb,$ln,$fn,$mn,$org,$location,$street,$city,$state,$zip,$phone,$mobile,$fax\n";
            }
        } else {
            print "Unable to open $usb\n";
        }
    }
}

sub get_user {
    my $usb = shift;
    my $ldap = "LDAP://cn=$usb,ou=corporate,ou=users,ou=corporate,dc=acme,dc=acme,dc=com";

    if(my $user = Win32::OLE->GetObject($ldap)){
        return $user;
    }
    return 0;
}

sub set_object_property {
    my ($object,$property,$value)=@_;
    if ($value) {
        $object->Put($property,$value);
    } else {
        #clear the property: PutEx with ADS_PROPERTY_CLEAR = 1
        $object->PutEx(1,$property,$value);
    }
    print "Error: ".Win32::OLE->LastError."," if (Win32::OLE->LastError());
}

sub remove_before_after_space {
    my $data = shift;
    $data =~ s/^\s+|\s+$//g;
    return $data;
}

There is just one gotcha. When setting an object property to NULL, or if you want to blank out that property, you have to use a different method called PutEx, with ADS_PROPERTY_CLEAR = 1. Thus, instead of

    $object->Put($property,$value);

use

    $object->PutEx(1,$property,$value);

Friday, March 30, 2007

Access Lotus Notes using Perl via Win32::OLE

Are you a Lotus Notes user? Did you ever wonder how you can extract text out of Lotus Notes using Perl? Here's a script that I wrote to extract notification emails:

use Win32::OLE;

my $server;
my $database;
my $folder;

my $file = "temp_file.csv";

my $ini = "get_emails.ini";
if (-e $ini) {
    open (INI,$ini) || die "Not able to open $ini: $!\n";
    chomp ($server = <INI>);
    chomp ($database = <INI>);
    chomp ($folder = <INI>);
    close INI;
} else {
    print "\nEnter Notes Server: ";
    chomp ($server = <STDIN>);
    print "\nEnter Notes Database: ";
    chomp ($database = <STDIN>);
    print "\nEnter Folder you want to access: ";
    chomp ($folder = <STDIN>);
}

#connect to the Notes database
my $Notes = Win32::OLE->new('Notes.NotesSession') || warn "Cannot start Lotus Notes Session object: $!\n";
my $Database = $Notes->GetDatabase($server, $database);

#Fetch contents of the folder
my $Response = $Database->GetView($folder);
my $Count = $Response->TopLevelEntryCount;
my $Index = $Count;

open (OUT, ">$file") || die "Cannot write $file: $!\n";

#loop through all emails, newest first
for (1..$Count){
    my $Document = $Response->GetNthDocument($Index--);
    my $subject = $Document->GetFirstItem('Subject')->{Text};
    my $body = $Document->GetFirstItem('Body')->{Text};
    print OUT "Subject: $subject\n",
              "Body: $body\n";
}
close OUT;

`start excel.exe $file`;

This works if you have the Lotus Notes client installed on your machine. As you can see, the script uses OLE to access your mail database via the client. There is also a way to connect to any Lotus Notes database without installing Lotus Notes: it can be accessed using the ever-popular Perl DBI module. The latter is what I use to synchronize Lotus Notes databases to Oracle, MySQL, and SQL Server. I'll discuss more about that... stay tuned.

Wednesday, March 07, 2007

Notes on Business Intelligence Solution

These are some practical tips taken from this book - Practical Business Intelligence with SQL Server 2005.

Suggested Approach: Build a consistent relational data warehouse with a dimensional schema optimized for queries.

Overall Solution Diagram:

Choose between Star or Snowflake for a Dimension? Answer: It Depends

  • Choose snowflake when the dimension's attributes come from different sources.
  • Choose snowflake when the dimension has a strong natural hierarchy; it is easier to manage the ETL process in this case.

Use a surrogate key for every dimension table. Surrogate keys are the primary identifiers for all dimension tables in the data warehouse, and every fact table record that refers to a dimension always uses the surrogate key rather than the business key (the source system's primary key). All relationships in the data warehouse use the surrogate key, including the relationships between different dimension tables in a snowflake structure. Because the data warehouse uses surrogate keys and the source systems use business keys, one important step in the ETL process is to translate the business keys in the incoming transaction records into data warehouse surrogate keys before inserting the new fact records.
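
That key-translation step can be sketched in a few lines of Python (the column names and SKU values here are invented for illustration):

```python
surrogate_map = {}  # business key -> surrogate key, for one dimension

def surrogate_for(business_key):
    """Return the surrogate key for a business key, assigning the next
    integer the first time a key is seen (a stand-in for a dimension lookup)."""
    if business_key not in surrogate_map:
        surrogate_map[business_key] = len(surrogate_map) + 1
    return surrogate_map[business_key]

# Incoming fact rows carry the source system's business keys...
facts = [{"product_bk": "SKU-42", "amount": 10.0},
         {"product_bk": "SKU-7",  "amount": 3.5},
         {"product_bk": "SKU-42", "amount": 8.0}]

# ...which we swap for surrogate keys before inserting into the fact table.
loaded = [{"product_sk": surrogate_for(f["product_bk"]), "amount": f["amount"]}
          for f in facts]
print(loaded)
```

In a real ETL pipeline the map would be the dimension table itself, but the shape of the step is the same: look up (or create) the surrogate key, then write the fact row with it.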

Making a List of Candidate Attributes and Dimensions

When you are reviewing the information you have collected, look for terms that represent different ways of looking at data. A useful rule of thumb is to look for words such as "by" (as in, "I need to see profitability by product category"). If you keep a list of all these candidate attributes as you find them, you can start to group them into probable dimensions such as Product or Customer.

One thing to be careful of is synonyms: People often have many different ways of naming the same thing, and it is rare that everyone will agree on the definition of every term. Similarly, people in different parts of the business could be using the same term to mean different things. An important job during the modeling process is to identify these synonyms and imprecise names and to drive the business users toward consensus on what terms will mean in the data warehouse. A useful by-product of this process can be a data dictionary that documents these decisions as they are made.

Making a List of Candidate Measures

At the same time that you are recording the attributes that you have found, you will be looking for numeric measures. Many of the candidate measures that you find will turn out to be derived from a smaller set of basic measures, but you should keep track of all of them because they might turn out to be useful calculations that you can add into the OLAP cube later. The best candidates for measures are additive and atomic. That is, they can be added up across all the dimensions, including time, and they are not composed from other measures.

Grouping the Measures with the Same Grain into Fact Tables

Figuring out how to group measures into fact tables is a much more structured process than grouping related attributes into dimension tables. The key concept that determines which measures end up in a fact table is that every fact table has only one grain. After you have your list of candidate measures, you can set up a spreadsheet as shown in the table below, with the candidate dimensions on the columns and the candidate measures on the rows.

Measure         Product    Customer   Date
Sales Amount    SKU        Customer   Day
Quantity        SKU        Customer   Day
Budget Amount   Category   N/A        Month

For each measure, you need to figure out the grain or level of detail you have available. For example, for a specific sales amount from a sales transaction, you can figure out the customer that it was sold to, the product SKU that they bought, and the day that they made the purchase, so the granularity of the sales amount measure is Product SKU by Customer by Day. For budget amount, the business is only producing monthly budgets for each product category, so the granularity is Product Category by Month.

From the example in the table above, we end up with two different fact tables. Because the Sales Amount and Quantity measures both have the same granularity, they will be in the Sales fact table, which will also include Product, Customer, and Date dimension keys. A separate Budget fact table will have Product Category and Date dimension keys and a Budget Amount measure.

Identifying the granularity of every measure sounds simple in principle but often turns out to be difficult in practice. So, how do you know when you have made a mistake? One common sign is when you end up with some records in the fact table that have values for one set of measures and nulls for the remainder. Depending on how you load the data, you could also see that a given numeric quantity ends up repeated on multiple records in a fact table. This usually occurs when a measure has a coarser grain (such as Product Category rather than Product SKU) than the fact table.
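
The grouping step itself is mechanical once each measure's grain is known. A small Python sketch using the candidate table above:

```python
from collections import defaultdict

# Each measure's grain: one level per candidate dimension (N/A -> None).
measures = {
    "Sales Amount":  ("SKU", "Customer", "Day"),
    "Quantity":      ("SKU", "Customer", "Day"),
    "Budget Amount": ("Category", None, "Month"),
}

# Measures sharing a grain land in the same fact table.
fact_tables = defaultdict(list)
for name, grain in measures.items():
    fact_tables[grain].append(name)

for grain, names in fact_tables.items():
    print(grain, "->", names)  # two fact tables: Sales and Budget
```
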

High Level Architecture

Most important dimensional modeling lesson: A single fact table must never contain measures at different levels of granularity.

Monday, March 05, 2007

Baby Ivana Visits Us!
Cute little baby Ivana, our friends' first baby. Surely a delight to be visited by. My daughter didn't know how to react at first, since she's always been the youngest one in the house. Ivana's really a sweetie; she smiles on a whim, she likes my voice, and she laughs at my talking. She's just adorable!

Powerful Regular Expressions in Perl

Perl is rich with regular expressions. Most programs written in Perl have at least one regex for pattern matching. With regular expressions ingrained directly into the language's core, as well as into the minds of Perl programmers, most tasks involving string manipulation are easily done in Perl.

Well, last week I had an opportunity to answer an interesting question posted on the ActivePerl forum. The question has to do with parsing a DNA sequence.

Here is the question:


Write a regex for filtering out the string between a start tag and an end tag, so in the above case the output should be all combinations of start and end for AAA, BBB, and CCC.

My solution:


use strict;
use warnings;

my $dna_sequence  = shift @ARGV; # the DNA string to scan
my @dna_tags      = ('AAA', 'BBB', 'CCC');
my $joined_tags   = join "|", @dna_tags;
my $tag_pattern   = qr($joined_tags); #regexp quoting mechanism

foreach ( $dna_sequence =~ /(?=($tag_pattern.*?$tag_pattern))/g ) {
    #do whatever on the captured string
    #in this case, I just want to print it out
    print $_, "\n";
}
The trick here is the lookaround construct: a positive lookahead, specified with the special sequence (?=...), in this case (?=($tag_pattern.*?$tag_pattern)). The parentheses inside the construct capture the matched string. This is called a zero-width lookahead because, at the point of the match, the engine has not consumed any characters. But because of the /g modifier, the engine notices that it has not moved forward, so it advances one character position and tries the same match again, repeating until it reaches the end of the string. The net effect is overlapping matches.
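
The same overlapping-match trick carries straight over to other regex flavors. Here's a Python version of the idea (the tag list is assumed to be the AAA/BBB/CCC from the question):

```python
import re

tags = ["AAA", "BBB", "CCC"]
alternation = "|".join(tags)
# Zero-width lookahead with a capturing group inside, as in the Perl version.
pattern = re.compile(r"(?=((?:%s).*?(?:%s)))" % (alternation, alternation))

def overlapping_spans(sequence):
    """Every substring that starts and ends with a tag, overlaps included."""
    return pattern.findall(sequence)

print(overlapping_spans("AAAxxBBByyCCC"))  # ['AAAxxBBB', 'BBByyCCC']
```

re.findall returns the captured group for every match, and because each match is zero-width, the engine advances one position at a time, producing the overlapping results, just like the /g loop in Perl.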

There is an excellent article about these same regular expression constructs in this post.

I would recommend reading these two books: Mastering Regular Expressions and Perl Cookbook. Perl Cookbook is actually where I got the basic solution to this problem.

Friday, February 23, 2007

A Cube Example

Again, taken from MSDN...

A cube is defined by its measures and dimensions. The measures and dimensions in a cube are derived from the tables and views in the data source view on which the cube is based, or which is generated from the measure and dimension definitions.

The Imports cube contains two measures, Packages and Last, and three related dimensions, Route, Source, and Time.

Cube Example 1

The smaller alphanumeric values around the cube are the members of the dimensions. Example members are ground (member of the Route dimension), Africa (member of the Source dimension), and 1st quarter (member of the Time dimension).


The values within the cube cells represent the two measures, Packages and Last. The Packages measure represents the number of imported packages, and the Sum function is used to aggregate the facts. The Last measure represents the date of receipt, and the Max function is used to aggregate the facts.


The Route dimension represents the means by which the imports reach their destination. Members of this dimension include ground, nonground, air, sea, road, or rail. The Source dimension represents the locations where the imports are produced, such as Africa or Asia. The Time dimension represents the quarters and halves of a single year.


Business users of a cube can determine the value of any measure for each member of every dimension, regardless of the level of the member within the dimension, because Analysis Services aggregates values at upper levels as needed. For example, the measure values in the preceding illustration can be aggregated according to a standard calendar hierarchy by using the Calendar Time hierarchy in the Time dimension, as illustrated in the following diagram.

Diagram of measures organized along time dimension

In addition to aggregating measures by using a single dimension, you can aggregate measures by using combinations of members from different dimensions. This allows business users to evaluate measures in multiple dimensions simultaneously. For example, if a business user wants to analyze quarterly imports that arrived by air from the Eastern Hemisphere and Western Hemisphere, the business user can issue a query on the cube to retrieve the following dataset.

             Packages                                            Last
             All Sources   Eastern Hem.   Western Hem.           All Sources   Eastern Hem.   Western Hem.
All Time     25110         6547           18563                  Dec-29-99     Dec-22-99      Dec-29-99
1st half     11173         2977           8196                   Jun-28-99     Jun-20-99      Jun-28-99
1st quarter  5108          1452           3656                   Mar-30-99     Mar-19-99      Mar-30-99
2nd quarter  6065          1525           4540                   Jun-28-99     Jun-20-99      Jun-28-99
2nd half     13937         3570           10367                  Dec-29-99     Dec-22-99      Dec-29-99
3rd quarter  6119          1444           4675                   Sep-30-99     Sep-18-99      Sep-30-99
4th quarter  7818          2126           5692                   Dec-29-99     Dec-22-99      Dec-29-99
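
The two aggregation rules are easy to see in miniature. A small Python sketch (made-up facts, not the MSDN data):

```python
# Each fact: (route member, packages received, receipt date in ISO form).
facts = [("air",    100, "1999-03-30"),
         ("sea",    250, "1999-02-11"),
         ("ground",  75, "1999-03-15")]

packages_total = sum(p for _, p, _ in facts)  # Packages aggregates with Sum
last_receipt   = max(d for _, _, d in facts)  # Last aggregates with Max
                                              # (ISO dates compare correctly as strings)
print(packages_total, last_receipt)
```

Rolling these up per dimension member, instead of over all facts at once, is what produces cross-tabs like the one above.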

After a cube is defined, you can create new aggregations, or you can change existing aggregations to set options such as whether aggregations are precalculated during processing or calculated at query time.

Thursday, February 22, 2007

Why use UDM?

Continuing with my research on data warehousing, the following are some of my notes on data modeling, taken from this SQL Server tutorial on MSDN.

UDM - Unified Dimensional Model

A user who wants to retrieve information directly from a data source, such as an Enterprise Resource Planning (ERP) database, faces several significant challenges:

  • The contents of such data sources are frequently very hard to understand, being designed with systems and developers instead of users in mind.
  • Information of interest to the user is typically distributed among multiple heterogeneous data sources. Even if dealing only with different relational databases, the user must understand the details of each, such as the dialect of SQL that is used. Worse, those data sources might be of very different types, including not only relational databases but files and Web services.
  • Whereas many data sources are oriented toward holding large quantities of transaction level detail, frequently the queries that support business decision-making involve summary, aggregated information. With increased data volumes, the time that is required to retrieve such summary values for interactive end-user analysis can be prohibitive.
  • Business rules are generally not encapsulated in the data sources. Users are left to make their own interpretation of the data.

The role of a Unified Dimensional Model (UDM) is to provide a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources. The user issues queries against the UDM using a variety of client tools, such as Microsoft Excel.

Clients access all data sources through a single UDM

There are advantages to the end user even when the UDM is constructed only as a thin layer over the data source: simpler, more easily understood model of the data, isolation from heterogeneous backend data sources, and improved performance for summary type queries. In some scenarios, a simple UDM can be constructed automatically. Greater investment in the construction of the UDM can generate additional benefits that accrue from the richness of metadata that the model can provide.

The UDM provides the following benefits:

  • Greatly enriches the user model.
  • Provides high performance queries supporting interactive analysis, even over large data volumes.
  • Captures business rules in the model to support richer analysis.
  • Supports ‘closing the loop’: letting users act upon the data they see.

Monday, February 19, 2007

Cool Snow in Bear Valley
This is a Flickr badge showing photos in a set called Bear Valley Feb 2007.
We just got back from a three-hour drive from Angels Camp, where we stayed for two nights. Bear Valley was just an hour away from where we stayed. Driving north on Highway 4, you would not think there's lots of snow until you get to the top of the mountains. Bear Valley is so nice; we particularly enjoyed the gentle slopes and vast stretches of snow, well suited for our 3- and 4-year-old kids. My son, who's 9, especially enjoyed the two sledding hills. He went up them countless times, tirelessly. I asked my daughter if she really enjoyed it. She said "Yes," and what she enjoyed most was "making snowballs."

Thursday, February 15, 2007

Five Reasons Why I Sleep Past Midnight Everyday

This is in response to a contest called "Five Things" hosted by one of my favorite blogs //engtech. So without further ado, here's my "five things":

Five Reasons Why I Sleep Past Midnight Everyday:
  1. So I can do things after putting my children to bed.
  2. So I can catch up on my reading and study materials.
  3. So I can surf the net.
  4. So I can have quiet time for myself.
  5. I have insomnia, which is probably the result of the first four reasons mentioned.

Now that I think about it, the result could also be the reason why I do the first four. It's a cycle, you know; a never-ending thing.

Wednesday, February 14, 2007

Kill Processes in Windows using Perl

When automating installs, it is sometimes necessary to stop certain processes before proceeding. This can often be done with Perl's built-in kill() function, but some processes are hard to kill, so I have been using this script that I borrowed from Dave Roth's book Win32 Perl Scripting.

Here's how I do it:

use SamTools::Process;

# Kill anti-spyware processes before the install proceeds

sub kill_process {
    my $process = shift;
    my $result  = 0;

    my $pid = SamTools::Process::getpid($process);

    if ($pid){
        if (SamTools::Process::kill_pid($pid)){
            $result = 1;
            $log->message("Process=Successfully killed $process\n");
        } else {
            $log->exception("Process=Cannot kill $process\n");
        }
    } else {
        $log->message("Process=$process not found\n");
    }
    return $result;
}

Here is the module I factored it into:

package SamTools::Process;

use Win32::PerfLib;
use Win32::API;
use Win32::Process::Info;

sub getpid {
    my $process = shift;

    my $server = $ENV{COMPUTERNAME};
    my $pid;

    # First approach: look the process up via the performance counters
    Win32::PerfLib::GetCounterNames($server, \%counter);
    %r_counter = map { $counter{$_} => $_ } keys %counter;
    $process_obj = $r_counter{Process};
    $process_id = $r_counter{'ID Process'};
    $perflib = new Win32::PerfLib($server) || return 0;
    $proc_ref = {};
    $perflib->GetObjectList($process_obj, $proc_ref);
    $instance_ref = $proc_ref->{Objects}->{$process_obj}->{Instances};
    foreach $p (sort keys %{$instance_ref}){
        $counter_ref = $instance_ref->{$p}->{Counters};
        foreach $i (keys %{$counter_ref}){
            if ($counter_ref->{$i}->{CounterNameTitleIndex} == $process_id
                && $instance_ref->{$p}->{Name} eq $process){
                $pid = $counter_ref->{$i}->{Counter};
            }
        }
    }

    # Try again using a different approach: WMI
    unless ($pid){
        if (my $pi = Win32::Process::Info->new($server)){
            my $processes = $pi->GetProcInfo();
            foreach (@$processes){
                if ($_->{Name} =~ /$process/i){
                    $pid = $_->{ProcessId};
                    last;
                }
            }
        }
    }

    return $pid ? $pid : 0;
}
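The shape of getpid is worth noting: an exact lookup through one source (perflib), then a looser, case-insensitive fallback through another (WMI). Here is that primary-then-fallback pattern sketched in Python over a hypothetical process table (the names and PIDs are made up for illustration):

```python
import re

# Hypothetical process table standing in for the perflib/WMI results
processes = [{"Name": "notepad.exe", "ProcessId": 1234},
             {"Name": "Spyware Scanner", "ProcessId": 5678}]

def getpid(name):
    # Primary pass: exact name match (like the perflib lookup)
    for p in processes:
        if p["Name"] == name:
            return p["ProcessId"]
    # Fallback pass: case-insensitive substring match (like the WMI pass)
    for p in processes:
        if re.search(name, p["Name"], re.IGNORECASE):
            return p["ProcessId"]
    return 0

print(getpid("notepad.exe"))  # 1234
print(getpid("spyware"))      # 5678
print(getpid("missing"))      # 0
```

The looser second pass is what lets the script find a process even when the caller passes only part of its name.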

sub pidalive {
    my $pid = shift;
    my $server = $ENV{COMPUTERNAME};

    Win32::PerfLib::GetCounterNames($server, \%counter);
    %r_counter = map { $counter{$_} => $_ } keys %counter;
    $process_obj = $r_counter{Process};
    $process_id = $r_counter{'ID Process'};
    $perflib = new Win32::PerfLib($server) || return 0;
    $proc_ref = {};
    $perflib->GetObjectList($process_obj, $proc_ref);
    $instance_ref = $proc_ref->{Objects}->{$process_obj}->{Instances};
    foreach $p (sort keys %{$instance_ref}){
        $counter_ref = $instance_ref->{$p}->{Counters};
        foreach $i (keys %{$counter_ref}){
            if ($counter_ref->{$i}->{Counter} == $pid){
                return $pid;
            }
        }
    }
    return 0;
}
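pidalive answers a simple question: does a process with this PID still exist? The module answers it by scanning the Windows performance counters; on POSIX systems the conventional way is "signal 0", which performs error checking without delivering anything. A hedged Python sketch of that alternative (this is not the blog's Windows technique, just the common Unix equivalent):

```python
import os

def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX semantics)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is sent
        return True
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True

print(pid_alive(os.getpid()))  # True: we are certainly running
```

Either way, the caller can poll this in a loop after issuing a kill to confirm the process actually went away.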

sub kill {
    my $process = shift;
    my $pid = getpid($process);
    if ($pid){
        $iResult = ForceKill( $pid );
        return 1 if( $iResult );
    }
    return 0;
}

sub kill_pid {
    my $pid = shift;
    $iResult = ForceKill( $pid );
    return 1 if( $iResult );
    return 0;
}

sub ForceKill {
    my( $Pid ) = @_;
    my $iResult = 0;
    my $phToken = pack( "L", 0 );
    # Fetch the current process's token
    if( $OpenProcessToken->Call( $GetCurrentProcess->Call(),
                                 $TOKEN_ADJUST_PRIVILEGES | $TOKEN_QUERY,
                                 $phToken ) ){
        my $hToken = unpack( "L", $phToken );
        # Set the debug privilege on the token
        if( SetPrivilege( $hToken, $SE_DEBUG_NAME, 1 ) ){
            # Now that we have debug privileges on the process
            # open the process so we can mess with it.
            my $hProcess = $OpenProcess->Call( $PROCESS_TERMINATE, 0, $Pid );
            if( $hProcess ){
                # We no longer need the debug privilege since we have opened
                # the process so remove the privilege.
                SetPrivilege( $hToken, $SE_DEBUG_NAME, 0 );
                # Let's terminate the process
                $iResult = $TerminateProcess->Call( $hProcess, 0 );
                $CloseHandle->Call( $hProcess );
            }
        }
        $CloseHandle->Call( $hToken );
    }
    return $iResult;
}

sub SetPrivilege {
    my( $hToken, $pszPriv, $bSetFlag ) = @_;
    my $iResult = 0;
    my $pLuid = pack( "Ll", 0, 0 );
    # Lookup the LUID of the privilege
    if( $LookupPrivilegeValue->Call( "\x00\x00", $pszPriv, $pLuid ) ){
        # Build a TOKEN_PRIVILEGES structure: privilege count, the LUID,
        # and the attribute flags
        my $pPrivStruct = pack( "LLlL",
                                1,
                                unpack( "Ll", $pLuid ),
                                ( ( $bSetFlag )? $SE_PRIVILEGE_ENABLED : 0 ) );
        # Now modify the process's token to set the required privilege
        $iResult = ( 0 != $AdjustTokenPrivileges->Call( $hToken,
                                                        0,
                                                        $pPrivStruct,
                                                        length( $pPrivStruct ),
                                                        0,
                                                        0 ) );
    }
    return $iResult;
}
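The pack("LLlL", ...) call above builds a raw TOKEN_PRIVILEGES buffer by hand: a 32-bit privilege count, the 64-bit LUID split into an unsigned low part and a signed high part, and a 32-bit attributes field. The same byte-level layout can be sketched with Python's struct module (the LUID value here is hypothetical; a real one comes from LookupPrivilegeValue):

```python
import struct

SE_PRIVILEGE_ENABLED = 0x02

# TOKEN_PRIVILEGES layout: count (u32), LUID low (u32), LUID high (i32),
# attributes (u32) -- mirroring Perl's pack("LLlL", ...)
luid_low, luid_high = 20, 0  # hypothetical LUID
buf = struct.pack("<LLlL", 1, luid_low, luid_high, SE_PRIVILEGE_ENABLED)

print(len(buf))  # 16 bytes total
count, low, high, attrs = struct.unpack("<LLlL", buf)
print((count, low, high, attrs))  # (1, 20, 0, 2)
```

Seeing the 16-byte layout spelled out makes it clearer why the Perl code passes length($pPrivStruct) to AdjustTokenPrivileges.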

sub Configure {
    $TOKEN_ADJUST_PRIVILEGES = 0x0020;
    $TOKEN_QUERY             = 0x0008;
    $SE_PRIVILEGE_ENABLED    = 0x02;
    $PROCESS_TERMINATE       = 0x0001;
    $SE_DEBUG_NAME           = "SeDebugPrivilege";

    # Prepare to use some specialized Win32 API calls
    $GetCurrentProcess     = new Win32::API( 'Kernel32.dll',
                                             'GetCurrentProcess', [], N ) || die;
    $OpenProcessToken      = new Win32::API( 'AdvApi32.dll',
                                             'OpenProcessToken', [N,N,P], I ) || die;
    $LookupPrivilegeValue  = new Win32::API( 'AdvApi32.dll',
                                             'LookupPrivilegeValue', [P,P,P], I ) || die;
    $AdjustTokenPrivileges = new Win32::API( 'AdvApi32.dll',
                                             'AdjustTokenPrivileges', [N,N,P,N,P,P], I ) || die;
    $OpenProcess           = new Win32::API( 'Kernel32.dll',
                                             'OpenProcess', [N,N,N], N ) || die;
    $TerminateProcess      = new Win32::API( 'Kernel32.dll',
                                             'TerminateProcess', [N,N], I ) || die;
    $CloseHandle           = new Win32::API( 'Kernel32.dll',
                                             'CloseHandle', [N], I ) || die;
}

Configure();

1;



=head1 NAME

SamTools::Process - Sam's Process controllers.

=head1 SYNOPSIS

    use SamTools::Process;

=head1 DESCRIPTION

This module controls processes on Win32 systems. Currently, it only has killing capability.

=head1 kill