Automated Workflow Environments and EMR

October 30, 2006

Well, we work in the next era of software development, not only designing applications, but also developing systems that communicate with each other, thus participating in a workflow.

Automating this workflow through the seamless integration of these apps is a task that challenges many of the industries that we work in.

Automated Workflow Environments are environments in which multiple systems contribute and communicate, so that a network of these apps can solve complex problems efficiently with no human interaction. You can call them Digital Ecosystems.

You can construct workflow nets to describe the complex problems that these systems solve. Workflow nets, a subclass of Petri nets, are attractive models for analyzing complex business processes. Because of their sound theoretical foundation, Petri nets have been used successfully to model and analyze processes from many domains, for example software and business processes. A Petri net is a directed graph with two kinds of nodes – places and transitions – where each arc connects a place to a transition or a transition to a place. Each place can contain zero, one or more tokens. The state of a Petri net is determined by the distribution of tokens over places. A transition can fire if each of its input places contains at least one token. When the transition fires, i.e. executes, it takes one token from each input place and puts one token on each output place.
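
The firing rule is easier to see in a few lines of code. Below is a minimal sketch in Python, assuming every arc has weight 1; the class name PetriNet and the place and transition names (waiting, doctor_free, start_exam and so on) are made up for illustration and do not come from any particular workflow tool.

```python
from collections import Counter


class PetriNet:
    """A minimal place/transition net; every arc is assumed to have weight 1."""

    def __init__(self, marking):
        # marking maps a place name to the number of tokens currently on it
        self.marking = Counter(marking)
        # transitions maps a name to (input places, output places)
        self.transitions = {}

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def enabled(self, name):
        # A transition is enabled when every input place holds at least one token.
        inputs, _ = self.transitions[name]
        return all(self.marking[p] >= 1 for p in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1  # consume one token from each input place
        for p in outputs:
            self.marking[p] += 1  # produce one token on each output place


# Hypothetical example: one waiting patient and one free doctor.
net = PetriNet({"waiting": 1, "doctor_free": 1})
net.add_transition("start_exam", ["waiting", "doctor_free"], ["in_exam"])
net.add_transition("finish_exam", ["in_exam"], ["done", "doctor_free"])

net.fire("start_exam")
state_after_start = {p: n for p, n in net.marking.items() if n > 0}
net.fire("finish_exam")
state_after_finish = {p: n for p, n in net.marking.items() if n > 0}
print(state_after_start)
print(state_after_finish)
```

After the second firing the doctor token is back on doctor_free, which is how a shared resource is usually modeled: the exam cannot start again until both a patient and the doctor are available.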

In a hospital environment, for example, the processes involved show complex and dynamic behavior that is difficult to control. The workflow net that models such a process provides good insight into it and, thanks to its formal representation, offers techniques for improved control.

Workflows are case oriented, which means that each activity executed in the workflow corresponds to a case. In a hospital domain, a case corresponds to a patient and an activity corresponds to a medical activity. The process definition of a workflow assumes that a partial order or sequence exists between activities, which establishes which activities have to be executed in what order. In the Petri net formalism, workflow activities are modeled as transitions and the causal dependencies between activities are modeled as places and arcs. Routing in a workflow uses four kinds of routing constructs: sequential, parallel, conditional and iterative routing. These constructs define the route taken by ‘tokens’ in the workflow.
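
To make the routing constructs concrete, here is a small Python sketch of one patient case moving through a workflow net. The activity and place names (register, examine, run_lab and so on) are hypothetical. Only sequential and parallel routing are shown; conditional routing would be two transitions competing for the same token, and iterative routing would be a transition that puts a token back on an earlier place.

```python
# Each transition: name -> (input places, output places).
# One token on "start" stands for one patient case. All names are illustrative.
WORKFLOW = {
    "register": (["start"], ["registered"]),
    "examine": (["registered"], ["lab_ordered", "xray_ordered"]),  # parallel split
    "run_lab": (["lab_ordered"], ["lab_done"]),
    "run_xray": (["xray_ordered"], ["xray_done"]),
    "report": (["lab_done", "xray_done"], ["end"]),  # parallel join
}


def enabled(marking, name):
    """A transition is enabled when each input place holds at least one token."""
    inputs, _ = WORKFLOW[name]
    return all(marking.get(p, 0) >= 1 for p in inputs)


def fire(marking, name):
    """Return the marking after firing; the input marking is left unchanged."""
    inputs, outputs = WORKFLOW[name]
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] = new.get(p, 0) + 1
    return new


def play_out(marking):
    """Fire the first enabled transition repeatedly until none is enabled."""
    trace = []
    while True:
        ready = [t for t in WORKFLOW if enabled(marking, t)]
        if not ready:
            return trace, marking
        marking = fire(marking, ready[0])
        trace.append(ready[0])


trace, final = play_out({"start": 1})
print(trace)
print(final)
```

The report transition cannot fire until both the lab and the x-ray branches have finished, which is exactly the synchronization a parallel join is meant to express.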

Well, enough theory, how does this apply?

Think of this in practical terms using the example of an EMR*, CPR* or HIS* system:
• A patient arrives at a hospital for a consultation or particular set of exams or procedures.
• The patient is registered, if new to the hospital. A visit or encounter record is created in the Patient Chart (EMR) – with vitals, allergies, current meds and insurance details.
• The physician examines the patient and orders labs, diagnostic exams or prescription medications for the patient possibly using a handheld CPOE*
• The patient is scheduled for the exams in the RIS – radiology info system or LIS – laboratory info system or HIS (hospital info system)
• The RIS or LIS or HIS sends notifications to the Radiology and/or Cardiology and/or Lab or other Departments in the hospital through HL7 messages for the various workflows.
• The various systems in these departments then send HL7, DICOM or proprietary messages to update the devices or modalities with the patient data (prior history, etc.).
• The patient is then taken around by the nurses to the required modalities in the exam/LAB areas to perform the required activities.
• The patient finishes the hospital activities while the diagnosis continues and the entire data gathered is coalesced and stored in rich structured report or multimedia formats in the various repositories – resulting in a summary patient encounter/visit record in the Electronic Patient Record in the EMR database.
• There could also be other workflows triggered – pharmacy, billing, etc.
• The above is just the scenario for an OUTPATIENT; there are other workflows for INPATIENT – ED/ICU/other patients.
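
Several of the steps above depend on HL7 messages passing between systems. As a rough sketch, the Python below pulls a few fields out of an HL7 v2 registration message. The sample message and all of its values are invented for illustration, and the parser deliberately ignores repetitions and escape sequences, so treat it as a reading aid rather than a real HL7 library.

```python
# Illustrative HL7 v2 message; every value here is made up for this sketch.
SAMPLE = "\r".join([
    r"MSH|^~\&|HIS|HOSPITAL|RIS|RADIOLOGY|200610301200||ADT^A04|MSG00001|P|2.3",
    "PID|1||123456^^^HOSPITAL^MR||DOE^JANE||19700101|F",
    "PV1|1|O",
])


def parse_segments(message):
    """Split an HL7 v2 message into {segment id: list of fields}.

    Keeps only the first occurrence of each segment and ignores repetitions
    (~) and escape sequences, which a real parser must handle.
    """
    segments = {}
    for line in message.split("\r"):  # segments are separated by carriage returns
        if line:
            fields = line.split("|")  # fields are separated by the pipe character
            segments.setdefault(fields[0], fields)
    return segments


def summarize(message):
    seg = parse_segments(message)
    # After splitting on "|", MSH-9 sits at index 8 because MSH-1 is the field
    # separator itself; for other segments, field N sits at index N.
    family, given = seg["PID"][5].split("^")[:2]
    return {
        "message_type": seg["MSH"][8],       # ADT^A04 = register a patient
        "mrn": seg["PID"][3].split("^")[0],  # first component of PID-3
        "patient": f"{given} {family}",
        "patient_class": seg["PV1"][2],      # O = outpatient
    }


summary = summarize(SAMPLE)
print(summary)
```

A receiving system such as the RIS would typically use the identifier from PID-3 to match the message to its own patient record, which is why consistent patient identification matters so much in the list that follows.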

The key problems in this ‘Automated Workflow Environment’ are:

• Accurate Patient Identification and Portability to ensure that the Patient Identity is unique across multiple systems/departments and maybe hospitals. The Patient Identity key is also essential to Integrating Patient healthcare across clinics, hospitals, regions (RHIO) and states.
• Support for Barcode/RFID on Patient Wrist Bands, Prescriptions/Medications, Billing (using MRN, Account Number, Order Number, Visit Number), etc. to enable automation and quick and secure processing.
• Quick Patient data retrieval and support for parallel transactions
• Audits and Logs for tracking access to this system
• Support for PACS, Emergency care, Chronic care (ICU / PACU), Long Term care, Periodic visits, point of care charting, meds administration, vital signs data acquisition, alarm notification, surveillance for patient monitors, smart IV pumps, ventilators and other care areas – treatment by specialists in off-site clinics, etc.
• Support for Care Plans, Order sets and Templates, results’ tracking and related transactions.
• Quick vital sign results and diagnostic reporting
• Effective display of specialty content – diagnostic/research images, structured “rich” multimedia reports.
• Secure and efficient access to this data from the internet
• Removal of paper documentation and effective transcription
• SSO (Single Sign On), Security roles and Ease of use for the various stakeholders – here, the patient, the RN, physician, specialist, IT support etc.
• Seamless integration with current workflows and support for updates to hospital procedures
• Modular deployment of new systems and processes – long term roadmap and strategies to prevent costly upgrades or vendor changes.
• HIPAA, JCAHO and Legal compliance – which has an entire set of guidelines, privacy and security being the chief ones.
• Efficient standardized communication between the different systems either via “standard” HL7 or DICOM or CCOW or proprietary.
• Support for a High speed Fiber network system for high resolution image processing systems like MRI, X-Ray, CT-SCAN, etc.
• A high speed independent network for real time patient monitoring systems and devices
• Guaranteed timely Data storage and recovery with at least 99.9999% visible uptime
• Original Patient data available for at least 7 years and compliance with FDA rules.
• Disaster recovery compliance and responsive Performance under peak conditions.
• Optimized data storage ensuring low hardware costs
• Plug ‘n’ Play of new systems and medical devices into the network, wireless communication among vital signs devices and servers, etc.
• Location tracking of patients and devices (RFID based) and Bed Tracking in the hospital
• Centralized viewing of the entire set of Patient data – either by a patient or his/her physician
• Multi-lingual user interface possibilities (in future?)
• Correction of erroneous data and merging of Patient records.
• Restructuring existing hospital workflows and processes so that this entire automated workflow environment works with a definite ROI and within a definite time period!
• Integration with billing, insurance and other financial systems related to the care charges.
• Future-proofing and support for new technologies like Clinical Decision Support (CDSS) – again a long term roadmap is essential.
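
To put the uptime figure in the list above into perspective, here is a small arithmetic sketch in Python. The function name is made up and a 365-day year is assumed; the point is only to show how little downtime a figure like 99.9999% leaves.

```python
def allowed_downtime_seconds(availability_percent, days=365):
    """Maximum downtime, in seconds, over the period for a given availability."""
    period_seconds = days * 24 * 60 * 60  # assumes a 365-day year by default
    return period_seconds * (1 - availability_percent / 100)


for pct in (99.9, 99.99, 99.9999):
    print(f"{pct}% uptime allows {allowed_downtime_seconds(pct):.1f} s of downtime per year")
```

At 99.9999% the budget works out to roughly 31.5 seconds per year, compared with almost nine hours at 99.9%, so the target has a direct bearing on the redundancy and disaster recovery items in the same list.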

ROI: How does a hospital get returns on this IT investment?

  1. Minimization of errors – medication or surgical – and the associated risks
  2. Electronic trail of patient case history available to patient, insurance and physicians
  3. Reduced documentation and improvement in overall efficiency and throughput
  4. Patient Referrals from satellite clinics who can use the EMR’s external web links to document on patients – thus providing a continuous electronic report
  5. Possible pay-per-use by external clinics – to use EMR charting facilities
  6. Remote specialist consultation
  7. Efficient Charges, Billing and quicker settlements
  8. Better Clinical Decision Support – due to an electronic database of past treatments
  9. In the long term, efficiency means cheaper insurance which translates to volume income
  10. Better compliance with standards – HIPAA, privacy requirements, security
  11. Reduced workload due to Process Improvement across departments – ED, Obstetrics/Gynecology, Oncology/Radiology, Orthopedic, Cardiovascular, Pediatrics, Internal Medicine, Urology, General Surgery, Ophthalmology, General/family practice, Dermatology, Psychiatry
  12. Improved Healthcare with Proactive Patient Care due to CDSS
  13. Quality of Patient Care: A silent factor in a hospital’s revenue is quality of patient care. One of the chief drivers of quality of patient care is the quality of information provided efficiently to the Physicians, through which they can make those critical decisions.

Now, the big picture becomes clear.

Doesn’t the above set of requirements apply to any domain? This analysis applies beyond the hospital domain: the same is true for a Biotech domain (where orders are received, and data is processed, analyzed, and then presented or packaged), and similarly for a Manufacturing, Banking or Insurance domain.

The need is for core engine software – based on EDI (Electronic Data Interchange) – that integrates these mini workflows and helps in their Process Re-Engineering, securely and effectively, using common intersystem communication formats like X12 or HL7 messages.

These Workflow Engines would be the hearts of the digital world!

*EMR – Electronic Medical Record
*CPR – Computerized Patient Record
*HIS – Hospital Information System
*CDSS – Clinical Decision Support System
*RHIO – Regional Health Information Organization
*CPOE – Computerized Physician Order Entry

Some of the information presented here is thanks to research papers and articles at:
*Common Framework for health information networks
*Discovery of Workflow Models for Hospital Data
*Healthcare workflow
*CCOW-IHE Integration Profiles
*Hospital Network Management Best Practices
*12 Consumer Values for your wall

What about the latest IT trends and their applications in healthcare?

We already know about Google Earth and Google Hybrid Maps and the advantages of Web 2.0.
The next best thing is to search for the best shopping deal or the best real estate by area on a hybrid map – this technique of recombining and reusing web applications is called a mashup (map-based results are often shown as a heat map).
Mashups have applications in possibly everything from Healthcare to Manufacturing.
Omnimedix is developing and deploying a nationwide data mashup – Dossia, a secure, private, independent network for capturing medical information, providing universal access to this data along with an authentication system for delivery to patients and consumers.

Here are some of the current ‘best in class’ mashups:
* After hours Emergency Doctors SMS system – transcribes voicemail into text and sends SMS to doctors. A similar application can be used for Transcription. Mashup (based on Interactive Voice Response – IVR): Amazon Mturk, StrikeIron Global SMS and Voice XML
* Calendar with Messages – listen to your calendar and leave messages too. Mashup (based on IVR): 30 Boxes based on Voxeo, Google Calendar
* – Housing/Climate/Jobs/Schools
* Visual Classifieds Browser – Search Apartments, visually
* – Real Estate/Home pricing
* – Rent comparison
* – Real Estate Statistical Analysis
* – Rent/Real Estate/Home pricing – linked to Craigslist
* – Google Maps + Travel Videos
* – Wheel of Zip Code based restaurants
* More sample links at this site (unofficial Google mashup tracker), which includes some notable sites:
* latest news from India by map
* read news by the map – slightly slow
* view news from Internet TV by map
* see a place in 360

What’s on the wish list? Well, a worldwide mashup for real estate, shopping, education, healthcare will do just fine. Read on to try out YOUR sample…
OpenKapow: The online mashup builder community that lets you easily make mashups. Use their visual scripting environment to create intelligent software Robots that can make mashups from any site with or without an API.
In the words of Dion Hinchcliffe, “Mashups are still new and simple, just like PCs were 20 years ago. The tools are barely there, but the potential is truly vast as hundreds of APIs are added to the public Web to build out of”.
Dion also covers the architecture and types of Mashups here, with an update on recombinant web apps.

Keep up to date on web2.0 at

Will Silverlight, with its simplified vector-based graphics and XAML (an XML-based, workflow-oriented language), be the replacement for Flash and JavaFX?

Well, the technology is promising, and many multimedia web content providers, including news channels, have signed up for Microsoft Silverlight (“WPF/E”) because of its lightweight browser-based viewer, which streams “DVD” quality video using the patented VC-1 video codec.

Microsoft® Silverlight™ Streaming by Windows Live™ is a companion service for Silverlight that makes it easier for developers and designers to deliver and scale rich interactive media apps (RIAs) as part of their Silverlight applications. The service offers web designers and developers a free and convenient solution for hosting and streaming cross-platform, cross-browser media experiences and rich interactive applications that run on Windows™ XP+ and Mac OS 10.4+.

The only problem is that Linux is left out, since the Mono Framework has not yet evolved sufficiently.

So, the new way to develop your AJAX RIA “multimedia web application” is to design the UI with an artist in Adobe Illustrator and mash it up with your existing RSS, LINQ, JSON, XML-based Web services, REST and WCF Services to deliver a richer, scalable web application.

Migrating to ASP.NET 2.0 – It’s backward compatible

October 21, 2005

Here are my findings, based on MSDN and on a site that has been running at Microsoft since Aug 2005 with better performance than before:

· Because of the way that the .NET Framework is designed, you can deploy the 2.0 framework without disrupting a current installation of the 1.0 or 1.1 frameworks.

To configure a 1.x application’s script map to use the .NET Framework version 2.0

  • On the Start menu, click Run.
  • In the Open box, type inetmgr and click OK.
  • In Internet Information Services (IIS) Manager, expand the local computer, and then expand Web Sites.
  • Select the target Web site that is running in the .NET Framework version 1.x.
  • Right-click the name of the virtual directory for the Web site, and then click Properties.
    The Properties dialog box appears.
  • In the ASP.NET version selection list, choose the .NET Framework version 2.0.
    Click OK.
  • Navigate to a page in your application and confirm that your application runs as expected.

· If you are planning on using ASP.NET 2.0 on a production site, you will need to acquire the Microsoft Visual Studio 2005 Beta 2 Go-Live license (Nov 2005 is the final release of VS .NET 2005, so this may change) or . Basically, Microsoft does not offer support for the pre-release products.
· ASP.NET 2.0 and ASP.NET 1.1 Applications can live on the same IIS Server: By default, your 1.x applications will continue to use the 1.x framework. However, you will have to configure your converted/new applications (web sites/virtual directories) to use the 2.0 framework.
· Requirements for hosting ASP.NET 2.0 Apps:
o Internet Information Services (IIS) version 5.0 or later. To access the features of ASP.NET, IIS with the latest security updates must be installed prior to installing the .NET Framework. (So you can run ASP.NET 2.0 apps on old boxes with IIS5-Win 2000 Server)
o ASP.NET is supported only on the following platforms: Microsoft Windows 2000 Professional (Service Pack 3 recommended), Microsoft Windows 2000 Server (Service Pack 3 recommended), Microsoft Windows XP Professional, and Microsoft Windows Server 2003 family.
o Microsoft Data Access Components (MDAC) 2.8 is recommended for applications that use data access.
o Supported Operating Systems: Windows 2000; Windows 98; Windows 98 Second Edition; Windows ME; Windows Server 2003; Windows XP. Make sure you have the latest service pack and critical updates for the version of Windows that you are running. To find recent security updates, visit Windows Update.
o You must also be running Microsoft Internet Explorer 5.01 or later for all installations of the .NET Framework. Install Internet Explorer 6.0 Service Pack 1.

Here’s what we gain:
New Features in ASP.NET 2.0
· Master pages are a new feature introduced in ASP.NET 2.0 to help you reduce development time for Web applications by defining a single location to maintain a consistent look and feel in a site. Master pages allow you to design a template that can be used to generate a common layout for many pages in the application.
· Content pages (I call them business logic sub-pages) are attached to a master-page and define content for any ContentPlaceHolder controls in the master page. The content page contains controls that reference the controls in the master page through the ContentPlaceHolder ID. The content pages and the master page combine to form a single response.
· Nested Master Pages: In certain instances, master pages must be nested to achieve increased control over site layout and style. For example, your company may have a Web site that has a constant header and footer for every page, but your accounting department has a slightly different template than your IT department.
· Overriding Master Pages: Although the goal of master pages is to create a constant look and feel for all of the pages in your application, there may be situations when you need to override certain content on a specific page. To override content in a content page, you can simply use a content control.
· Themes and Skins: ASP.NET 2.0 rectifies the issue of using CSS and inline styles in ASP.NET 1.1 pages, through the use of themes and skins, which are applied uniformly across every page and control in a Web site. A skin is a set of properties and templates that can be used to standardize the size, font, and other characteristics of controls on a page. Themes are similar to CSS style sheets in that both themes and style sheets define a set of common attributes that apply to any page where the theme or style sheet is applied.
· Security: Managing User Info with Profiles and Login Controls: The membership provider and login controls in ASP.NET 2.0 provide a unified way of managing user information. ASP.NET 2.0 offers new login controls to help create and manage user accounts without writing any code. The ASP.NET 2.0 profile features allow you to define, save, and retrieve information associated with any user that visits your Web site. In a traditional ASP application, you would have to develop your own code to gather the data about the user, store it in session during the user’s session, and save it to some persistent data store when the user leaves the Web site.
· Localization. Enabling globalization and localization in Web sites today is difficult, requiring large amounts of custom code and resources. ASP.NET 2.0 and Visual Studio 2005 provide tools and infrastructure to easily build localizable sites, including the ability to auto-detect the incoming locale and display the appropriate locale-based UI. Visual Studio 2005 includes built-in tools to dynamically generate resource files and localization references. Together, these make building localized applications a simple and integrated part of the development experience.
· 64-Bit Support. ASP.NET 2.0 is now 64-bit enabled, meaning it can take advantage of the full memory address space of new 64-bit processors and servers. Developers can simply copy existing 32-bit ASP.NET applications onto a 64-bit ASP.NET 2.0 server and have them automatically be JIT compiled and executed as native 64-bit applications (no source code changes or manual re-compile are required).
· Caching Improvements. ASP.NET 2.0 also now includes automatic database server cache invalidation. This powerful and easy-to-use feature allows developers to aggressively output cache database-driven page and partial page content within a site and have ASP.NET automatically invalidate these cache entries and refresh the content whenever the back-end database changes. Developers can now safely cache time-critical content for long periods without worrying about serving visitors stale data.
· Web Parts: Web Parts are modular components that can be included and arranged by the user to create a productive interface that is not cluttered with unnecessary details. The user can:
o Choose which parts to display.
o Configure the parts in any order or arrangement.
o Save the view from one Web session to the next.
o Customize the look of certain Web Parts.
· Better Development Environment: ASP.NET 2.0 continues in the footsteps of ASP.NET 1.x by providing a scalable, extensible, and configurable framework for Web application development. The core architecture of ASP.NET has changed to support a greater variety of options for compilation and deployment. As a developer, you will also notice that many of your primary tasks have been made easier by new controls, new wizards, and new features in Visual Studio 2005. Finally, ASP.NET 2.0 expands the palette of options even further by introducing revolutionary new controls for personalization, themes and skins, and master pages. All of these enhancements build on the ASP.NET 1.1 framework to provide an even better set of options for Web development within the .NET Framework.
· Last but not least, there’s a host of new language features that reduce code lines in .NET 2.0: What’s New in the C# 2.0 Language and Compiler
With the release of Visual Studio 2005, the C# language has been updated to version 2.0, which supports the following new features:
o Generics
Generic types are added to the language to enable programmers to achieve a high level of code reuse and enhanced performance for collection classes. Generic types can differ only by arity. Parameters can also be forced to be specific types. For more information, see Generic Type Parameters.

o Iterators
Iterators make it easier to dictate how a foreach loop will iterate over a collection’s contents.

o Partial Classes
Partial type definitions allow a single type, such as a class, to be split into multiple files. The Visual Studio designer uses this feature to separate its generated code from user code.

o Nullable Types
Nullable types allow a variable to contain a value that is undefined. Nullable types are useful when working with databases and other data structures that may contain elements that contain no specific values.

o Anonymous Methods
It is now possible to pass a block of code as a parameter. Anywhere a delegate is expected, a code block can be used instead: there is no need to define a new method.

o Namespace alias qualifier
The namespace alias qualifier (::) provides more control over accessing namespace members. The global:: alias allows access to the root namespace, which may be hidden by an entity in your code.

o Static Classes
Static classes are a safe and convenient way of declaring a class containing static methods that cannot be instantiated. In C# version 1.2 you would have defined the class constructor as private to prevent the class being instantiated.

o External Assembly Alias
Reference different versions of the same component, contained in the same assembly, with this expanded use of the extern keyword.

o Property Accessor Accessibility
It is now possible to define different levels of accessibility for the get and set accessors on properties.

o Covariance and Contravariance in Delegates
The method passed to a delegate may now have greater flexibility in its return type and parameters.

o How to: Declare, Instantiate, and Use a Delegate
Method group conversion provides a simplified syntax for declaring delegates.

o Fixed Size Buffers
In an unsafe code block, it is now possible to declare fixed-size structures with embedded arrays.

o Friend Assemblies
Assemblies can provide access to non-public types to other assemblies.

o Inline warning control
The #pragma warning directive may be used to disable and enable certain compiler warnings.

o volatile
The volatile keyword can now be applied to IntPtr and UIntPtr.

Thanks to the various links by Microsoft for the above info.

Simple SQL Server Performance Tips

July 29, 2005
  1. Always create a data model (ERD).
  2. Consider using an application block or a best practice based design.
  3. Make sure the database is normalized. This is very important; otherwise SQL Server will not give optimized query plans (Tips for SQL Server 2005 Query Plans). For a one-to-many (1:m or m:1) relation, ensure that the child table carries the parent table’s primary key as a foreign key, either as part of a composite primary key or alongside a surrogate primary key, e.g. a Person – Address relationship or a Product – Attribute relationship. For a many-to-many (m:n) relation, ensure that a third table holds the primary key combinations of the two related tables.
  4. Make sure database security is controlled through views/stored procedures and finally roles.
  5. Make sure all commonly used joins have indexes on the join and where condition columns. Remember that a foreign key constraint doesn’t create an index.
  6. Prefer Inner Joins where possible, then Outer Joins. Use Left Outer joins only when foreign keys are nullable. Try to design around NULL (avoid foreign keys being NULL). Use SET ANSI_NULLS ON to ensure ANSI NULL compatibility. Remember: SELECT * FROM A1 WHERE b NOT IN (SELECT b FROM B1) returns no rows if any b in B1 is NULL.
  7. Keep transactions as short as possible.
  8. Reduce lock time. Try to develop your application so that it grabs locks at the latest possible time, and then releases them at the very earliest time.
  9. Always display the execution plan in Query Analyzer when testing stored procs/ad-hoc SQL and check that clustered index seeks or nested loops are used. NO HASH JOINs. Heavy I/O or hash joins show up as spikes in CPU usage in the performance monitor (perfmon).
  10. Avoid where conditions that apply functions to columns, since SQL Server doesn’t have function-based indices, e.g. don’t use select a,b from X where CONVERT(date) > '10/10/2005'; instead move the conversion to the constant on the RHS. This lets the query plan use the index on the column and helps execution plan reuse.
  11. Always run SQL Profiler while running your client application and check that the values in the Duration column are not too high; if they are, run the Index Tuning Wizard to find out whether any indices are missing for the queries.
  12. Always use connection pools. Connection strings must match exactly for connections to be pooled together; if connecting as an NT user, use the same user for every connection from the client. Remember: NT based connection pooling through delegation doesn’t work correctly in ASP.NET, and it isn’t as scalable as a SQL user based connection pool. You can always encrypt the connection string in the web.config file.
  13. SQL Server .NET data provider is the fastest. The SQL Server .NET provider uses TDS (Tabular Data Stream, which is the native SQL Server data format) to communicate with SQL Server. The SQL Server .NET provider can be used to connect to SQL Server 7.0 and SQL Server 2000 databases, but not SQL Server 6.5 databases. If you need to connect to a SQL Server 6.5 database, the best overall choice is the OLE DB.NET data provider.
  14. Two-part names – always fully qualify tables/views/stored procs, e.g. exec dbo.sp_storeusers or sp_sqlexec rsdb.dbo.sp_storeusers, to be compatible with future releases of SQL Server.
  15. SQL Server 2005 places no limits on server RAM, supports XML natively, has an inbuilt tuning advisor and works with the same sql syntax as SQL Server 2000. Constant Scan and other operators of SQL 2005.
  16. Server side cursors are not scalable in SQL Server => avoid them.
  17. Cursors are implicitly degraded to the next higher-cost cursor type when ORDER BY (not covered by an index), TOP, GROUP BY, UNION, DISTINCT, etc. is used.
  18. With ADO.NET, prefer DataReaders, then DataTables, then DataSets, in increasing order of performance cost.
  19. Consider the SELECT … WITH (NOLOCK) hint when the data being read is not modified often. NOLOCK can return dirty (uncommitted) data, so it is useful only when readers far outnumber writers. If appropriate, reduce lock escalation by using the ROWLOCK or PAGLOCK hints.
  20. Always de-allocate and close cursors, close connections.
  21. To check I/O costs, use set statistics io on, which reports just the reads on the tables touched (could be index, clustered index or table).
  22. A non-clustered index leads to a bookmark lookup when columns that are not in the index have to be fetched through the clustered index key or row ID.
  23. Internationalization: Always use UTC time in the database and plan for Unicode. Don’t assume the locale or the number of users; design for maximum scalability. Don’t use the NVARCHAR or NCHAR data types unless you need to store 16-bit character (Unicode) data. They take up twice as much space as VARCHAR or CHAR data types, increasing server I/O and wasting space in your buffer cache.
  24. ADO.NET calls ad-hoc queries using sp_ExecuteSQL(“…”), so their plans are cached. Search pages are therefore not a problem, but use the same connection string/pooling. Run select * from syscacheobjects to check the cache.
  25. Avoid SELECT *, which tends to lead to a table scan. Also have a clustered index on every table (unless it’s very small): when there is no index on the column used by the query, SQL Server must do a table scan to evaluate each row. A table scan is also done if all columns are requested or the where condition doesn’t use any indexed columns.
  26. Use SET rather than SELECT for assigning single values, e.g. SET @a = 10
  27. OPENXML is costly – it loads the XML parser in SQL Server – so prefer bulk insert/bulk copy.
  28. DBCC – database consistency checker (a misnomer now!):
    • DBCC FREEPROCCACHE – free the procedure cache
    • DBCC DBREINDEX – run at night: high cost, table lock; rebuilds the indexes and reapplies the fill factor, which is otherwise applied only when the index is created
    • DBCC CHECKDB – check database consistency
    • DBCC SHOWCONTIG – show fragmentation (extent level, logical scan, scan density)
    • DBCC INDEXDEFRAG – online operation that can run during the day: low cost, page locks; fixes logical scan fragmentation.
  29. Maintenance: update statistics every night, reindex every week.
  30. sp_who – shows the spids currently running and which ones are blocked
  31. Ask for less data over the wire – it’s better to work like Explorer and ask for parent nodes first, then child nodes based on user request.
  32. Use of DISTINCT is not very scalable and often points to a database model error (the model may not be relational).
  33. The optimizer uses constraints – so use indices, foreign keys, etc.
  34. A Clustered Index Scan or Full Table Scan usually means an index is missing. Use the Index Tuning Wizard in Thorough mode, with a Profiler trace captured while the application is running, to find the missing index. The Index Tuning Wizard can also be run on individual queries from SQL Query Analyzer.
  35. SQL Query Optimizer: the columns in the select list determine whether a BOOKMARK LOOKUP is needed; the predicate columns (where clause) determine whether a clustered or non-clustered index seek/scan is used (scan => between clause); the estimated number of resulting rows determines whether a Clustered Index Scan is done.
  36. dbcc memorystatus – does the value of Stolen under Buffer Distribution increase steadily? => the application is either consuming a lot of memory within SQL Server or is not releasing something. When an application acquires a lot of Stolen memory, SQL Server cannot page this to disk like it can for a data or index page. This is memory that must remain in SQL Server’s Buffer Pool and cannot be aged out. If the application is using cursors, memory associated with a cursor requires Stolen Memory while the cursor is open => perhaps the application is opening cursors but not closing them before opening new ones.
  37. OR/’in’ clauses are not very performant (most of the time they result in a table scan) ==> use unions for large queries.
  38. Always check for SQL Injection problems including comment web page injection issues.
  39. A view (a “virtual table”) built on other views is expanded at execution time and may be materialized in tempdb, so the query plan depends on the underlying SQL. If that SQL applies functions such as CONVERT or RTRIM to columns in the WHERE clause, the index on those columns won’t be used, because SQL Server has no function-based indexes like Oracle.
  40. Data types: char == trailing spaces (padded), varchar == no trailing spaces (not padded). If the text data in a column varies greatly in length, use a VARCHAR data type instead of a CHAR data type. The amount of space saved by using VARCHAR over CHAR on variable-length columns can greatly reduce I/O reads, improving overall SQL Server performance. Don’t use FLOAT or REAL data types for primary keys, as they add unnecessary overhead that hurts performance. Use one of the integer data types instead.
  41. Avoid SQL Server Application Roles, which do not take advantage of connection pooling.
  42. Set the following for all stored procs:
    SET ANSI_NULLS ON -- guarantees ANSI NULL behaviour in comparisons (=, <>) and IN operations
    SET CONCAT_NULL_YIELDS_NULL ON -- any string concatenated with NULL is NULL
    SET NOCOUNT ON -- minimizes network traffic
  43. O/RM – Object-relational mapping, or O/RM, is a programming technique that links relational databases to object-oriented language concepts, creating (in effect) a “virtual object database.”
  44. Simple tips for optimizing stored procedures:
    • Limit the use of cursors wherever possible. Use temp tables or table variables instead. Use cursors for small data sets only.
    • Make sure indexes are available and used by the query optimizer. Check the execution plan for confirmation.
    • Avoid using local variables in SQL statements in a stored procedure. They are not as optimizable as using parameters.
    • Use the SET NOCOUNT ON option to avoid sending unnecessary data to the client.
    • Keep transactions as short as possible to prevent unnecessary locking.
    • If your application allows, use the WITH (NOLOCK) table hint in SQL SELECT statements to avoid generating read locks. This is particularly helpful with reporting applications.
    • Format and comment stored procedure code to allow others to properly understand the logic of the procedure.
    • If you are executing dynamic SQL use SP_EXECUTESQL instead of EXEC. It allows for better optimization and can be used with parameters.
    • Access tables across all stored procedures in the same logical order to prevent deadlocks from occurring.
    • Avoid non-optimizable SQL search arguments like Not Equal, Not Like, and Like ‘%x’.
    • Use SELECT TOP n [PERCENT] instead of SET ROWCOUNT n to limit the number of rows returned.
    • Avoid using wildcards such as SELECT * in stored procedures (or any SQL application for that matter).
    • When executing stored procedures from a client, using ADO for example, avoid requesting a refresh of the parameters for the stored procedure using the Parameters.Refresh() command. This command forces ADO to interrogate the database for the procedure’s parameters and causes excessive traffic and application slowdowns.
    • Break large queries into smaller, simpler ones. Use table variables or temp tables for temporary storage, if necessary.
    • Understand your chosen client library (DB-LIB, ODBC, OLE DB, ADO, ADO.Net, etc.) Understand the necessary options to set to make queries execute as quickly as possible.
    • If your stored procedure generates one or more result sets, fetch those results immediately from the client to prevent prolonged locking. This is especially important if your client library is set to use server-side cursors.
    • Do not issue an ORDER BY clause in a SELECT statement if the order of rows returned is not important.
    • Put all DDL statements (like CREATE TABLE) before any DML statements (like INSERT). This helps prevent unwanted stored procedure recompiles.
    • Only use query hints if necessary. Query hints may help performance, but can prevent SQL Server from choosing the best execution plan. A query hint that works today may not work as well tomorrow if the underlying data changes in size or statistical distribution. Try not to outthink SQL Server’s query processor.
    • Consider using the SQL Server query governor cost limit option to prevent potentially long running queries from ever executing.
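
    Several of the tips above can be combined in one short procedure. The following is only a minimal sketch: the table dbo.Customers, the index IX_Customers_LastName and the procedure dbo.usp_FindCustomersByLastName are hypothetical names invented for illustration. It shows SET NOCOUNT ON, an explicit column list, TOP instead of SET ROWCOUNT, a trailing-wildcard LIKE, and sp_executesql with a typed parameter instead of EXEC on a concatenated string.

    ```sql
    -- Hypothetical table, index and procedure names, used only for illustration.
    CREATE TABLE dbo.Customers (
        CustomerID INT IDENTITY (1, 1) NOT NULL PRIMARY KEY,
        LastName   NVARCHAR(50) NOT NULL,
        FirstName  NVARCHAR(50) NOT NULL
    );
    CREATE INDEX IX_Customers_LastName ON dbo.Customers (LastName);
    GO

    CREATE PROCEDURE dbo.usp_FindCustomersByLastName
        @LastNamePrefix NVARCHAR(50)
    AS
    BEGIN
        -- Suppress the "n rows affected" messages sent to the client.
        SET NOCOUNT ON;

        DECLARE @sql NVARCHAR(500);
        DECLARE @pattern NVARCHAR(51);

        -- A trailing wildcard ('Smi%') keeps the predicate optimizable;
        -- a leading wildcard ('%ith') would not be.
        SET @pattern = @LastNamePrefix + N'%';

        -- Explicit column list instead of SELECT *, TOP instead of SET ROWCOUNT.
        SET @sql = N'SELECT TOP 100 CustomerID, LastName, FirstName
                     FROM dbo.Customers
                     WHERE LastName LIKE @p
                     ORDER BY LastName';

        -- The value travels as a typed parameter, not as concatenated text.
        EXEC sp_executesql @sql, N'@p NVARCHAR(51)', @p = @pattern;
    END
    GO

    -- Example call
    EXEC dbo.usp_FindCustomersByLastName @LastNamePrefix = N'Smi';
    ```

    Passing the search value as a parameter also lets SQL Server reuse the cached plan across calls; whether the index on LastName is actually used should still be confirmed in the execution plan.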

    Best index tuning:

    • Examine queries closely and keep track of column joins and columns that appear in WHERE clauses. It’s easiest to do this at query creation time.
    • Look for queries that return result sets based on ranges of one or more columns and consider those columns for the clustered index.
    • Avoid creating clustered primary keys if the PK is on an IDENTITY or incrementing DATETIME column. This can create hot-spots at the end of a table and cause slow inserts if the table is “write” heavy.
    • Avoid excessive indexes on columns whose statistical distribution indicates poor selectivity, i.e. values found in a large number of rows, like gender (SQL Server will normally do a table scan in this case).
    • Avoid excessive indexes on tables that have a high proportion of writes vs. reads.
    • Run the Index Tuning Wizard on a Coefficient trace file or Profiler trace file to see if you missed any existing indexes.
    • Do not totally rely on the Index Tuning Wizard. Rely on your understanding of the queries executed and the database.
    • If possible, make sure each table has a clustered index, which may be declared in the primary key constraint (if you are using a data modeling tool, check the tool’s documentation on how to create a clustered PK).
    • Indexes take up extra drive space, slow down INSERTs and UPDATEs slightly, and require longer backup/replication times, but since most tables have a much higher proportion of reads to writes, you can usually increase overall performance creating the necessary indexes, as opposed to not creating them.
    • Remember that the order of columns in a multi-column index is important. A query must make use of the columns as they are listed in the index to get the most performance increase. While you don’t need to use all columns, you cannot skip a column in the index and still receive index performance enhancement on that column.
    • Avoid creating unique indexes on columns that allow NULL values.
    • On tables whose writes far outweigh reads, consider changing the FILLFACTOR during index creation to a value that allows for adequate free space on the index pages to allow for optimal table inserts.
    • Make sure SQL Server is configured to auto update and auto create statistics. If these options cause undue strain on the server during business hours and you turn them off, make sure you manually update statistics, as needed. Also, note that sql server trace does cause a strain and slowdown on the server.
    • Consider rebuilding indexes on a periodic basis, by recreating them (consider using the DROP_EXISTING clause), using DBCC INDEXDEFRAG (SQL 2000), or DBCC DBREINDEX. These commands defragment an index and return the fill factor space to the leaf level of each index page. Consider a mix/match of each of these commands for your environment.
    • Do not create redundant indexes whose columns are already the leading columns of another index. For example, instead of creating two indexes, one on (LastName, FirstName) and one on (LastName), eliminate the second index on LastName.
    • Avoid creating indexes on descriptive CHAR, NCHAR, VARCHAR, and NVARCHAR columns that are not accessed often. These indexes can be quite large. If you need an index on a descriptive column, consider using an indexed view on a smaller, computed portion of the column. For example, create a view:
      CREATE VIEW view_name
      WITH SCHEMABINDING
      AS
      SELECT ID, SUBSTRING(col, 1, 10) AS col
      FROM dbo.table_name
      Then create an index on the reduced-size column col (the first index on a view must be a unique clustered index, assumed here to be on ID):
      CREATE UNIQUE CLUSTERED INDEX name_id ON view_name (ID)
      CREATE INDEX name ON view_name (col)
      This index can still be used by SQL Server when querying the table directly (although you would be limited in this example to searching for the first 10 characters only). Note: indexed views require SQL Server 2000 or later.
    • Use surrogate keys, like IDENTITY columns, for as many primary keys as possible. INT and BIGINT IDENTITY columns are smaller than corresponding alpha-numeric keys, have smaller corresponding indexes, and allow faster querying and joining.
    • If a column requires consistent sorting (ascending or descending order) in a query, for example:
      SELECT LastName, FirstName
      FROM Customers
      WHERE LastName LIKE 'N%'
      ORDER BY LastName DESC
      Consider creating the index on that column in the same order, for example:
      CREATE CLUSTERED INDEX lastname_ndx
      ON Customers (LastName DESC, FirstName)
      This prevents SQL Server from performing an additional sort on the data.
    • Create covering indexes wherever possible. A covering index covers all columns selected and referenced in a query. This eliminates the need to go to the data pages, since all the information is available in the index itself.
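
    The points above about column order and covering indexes can be sketched as follows. The table dbo.Orders and the index IX_Orders_Customer_Status are hypothetical names chosen for illustration, and the comments describe what the optimizer would typically be expected to do; the actual plan should be checked in Query Analyzer.

    ```sql
    -- Hypothetical table and index names, used only for illustration.
    CREATE TABLE dbo.Orders (
        OrderID    INT IDENTITY (1, 1) NOT NULL PRIMARY KEY,
        CustomerID INT NOT NULL,
        Status     CHAR(1) NOT NULL,
        Notes      VARCHAR(500) NULL
    );

    -- Multi-column index: CustomerID is the leading column.
    CREATE NONCLUSTERED INDEX IX_Orders_Customer_Status
        ON dbo.Orders (CustomerID, Status);
    GO

    -- Filters on the leading column and references only indexed columns,
    -- so the index can both be seeked and cover the query.
    SELECT CustomerID, Status
    FROM dbo.Orders
    WHERE CustomerID = 42;

    -- Skips the leading column: the index cannot be seeked on Status alone.
    SELECT CustomerID, Status
    FROM dbo.Orders
    WHERE Status = 'S';

    -- Notes is not in the index, so the query is no longer covered and
    -- the data pages have to be visited for each matching row.
    SELECT CustomerID, Status, Notes
    FROM dbo.Orders
    WHERE CustomerID = 42;
    ```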

    Benefits of using stored procedures

    • Stored procedures facilitate code reuse. You can execute the same stored procedure from multiple applications without having to rewrite anything.
    • Stored procedures encapsulate logic to get the desired result. You can change stored procedure code without affecting clients (assuming you keep the parameters the same and don’t remove any result set columns).
    • Stored procedures provide better security to your data. If you use stored procedures exclusively, you can remove direct Select, Insert, Update, and Delete rights from the tables and force developers to use stored procedures as the method for data access.
    • Stored procedures are a part of the database and go where the database goes (backup, replication, etc.).
    • Stored procedures improve performance. SQL Server combines multiple statements in a procedure into a unified execution plan.
    • Stored procedures reduce network traffic by preventing users from having to send large queries across the network.
    • SQL Server retains execution plans for stored procedures in the procedure cache. Execution plans are reused by SQL Server when possible, increasing performance. Note SQL 7.0/2000: this feature is available to all SQL statements, even those outside stored procedures, if you use fully qualified object names.
  45. Top 10 must-have features in O/R mapping tools:
    1. Flexible object mapping – tables and views mapping, multi-table mapping, naming conventions, attribute mapping, auto-generated columns, read-only columns, required columns, validation, formula fields, data type mapping
    2. Use of existing domain objects
    3. Transactional operations – COM+/MTS, stand-alone
    4. Relationships and life cycle management – 1 to 1, many to 1, 1 to many, many to many
    5. Object inheritance – 1 table per object or 1 table for all objects – handling insert, update, delete and load data
    6. Static and dynamic queries
    7. Stored procedure calls
    8. Object caching
    9. Customization of generated code and re-engineering support
    10. Code template customization
  46. Perform an audit of the SQL Code
    Transact-SQL Checklist

    • Does the Transact-SQL code return more data than needed?
    • Are cursors being used when they don’t need to be?
    • Are UNION and UNION ALL properly used?
    • Is SELECT DISTINCT being used properly?
    • Does the WHERE clause make use of indexes in search criteria?
    • Are temp tables being used when they don’t need to be?
    • Are hints being properly used in queries?
    • Are views unnecessarily being used?
    • Are stored procedures being used whenever possible?
    • Inside stored procedures, is SET NOCOUNT ON being used?
    • Do any of your stored procedures start with sp_?
    • Are all stored procedures owned by DBO, and referred to in the form of databaseowner.objectname?
    • Are you using constraints or triggers for referential integrity?
    • Are transactions being kept as short as possible?
  47. Application Checklist

    • Is the application using stored procedures, strings of Transact-SQL code, or using an object model, like ADO, to communicate with SQL Server?
    • What method is the application using to communicate with SQL Server: DB-LIB, DAO, RDO, ADO, .NET?
    • Is the application using ODBC or OLE DB to communicate with SQL Server?
    • Is the application taking advantage of connection pooling?
    • Is the application properly opening, reusing, and closing connections?
    • Is the Transact-SQL code being sent to SQL Server optimized for SQL Server, or is it generic SQL?
    • Does the application return more data from SQL Server than it needs?
    • Does the application keep transactions open when the user is modifying data?

    Thanks to the authors of the sites referenced above.

.NET 2.0 Generics samples & Performance Comparison

January 21, 2005

One of the most awaited features of Microsoft .NET 2.0 is generics. Starting with VS 2005, C#, Managed C++, and VB will have CLR support for generics.

Generics promise to increase type safety, improve performance, reduce code duplication (code reuse) and eliminate unnecessary casts (boxing). The most obvious application of generics in the framework class library is the set of generic collections in the new System.Collections.Generic namespace.

While generic types do have a similar syntax to C++ templates, they are instantiated at runtime as opposed to compile time (by Microsoft’s C++ compiler), and they can be reflected on via metadata. Also, in generics, member access on the type parameter is verified based on the constraints placed on the type parameter; whereas, in templates, member access is verified on the type argument after instantiation.

When the MS C++ compiler creates a separate type for every template specialization, that does not necessarily mean every type emits separate code. In fact, you’ll find that through a feature called COMDAT folding, most templates share quite a bit of code. (Basically, the myth of code bloat for templates isn’t true these days.) By having separate specializations of code at compile time, the compiler has the ability to optimize each type individually, which includes inlining. If every instance of a function is inlined, the template code is thrown away by the linker, resulting in less code. For these reasons, template code is generally much faster and leaner than an equivalent generic alternative.

C++ templates are a compile-time feature much like a macro preprocessor and are thus not a good solution for a highly dynamic language such as C#.

In .NET, generics are considerably faster than the equivalent object-based code for value types (no boxing is needed), and for reference types they are typically comparable in performance, or faster.

Good Links: Introducing Generics in the CLR

Generics Performance Results and a sample to test other collections

Generics Performance (CSharp)

The problem with .NET generics

A generic Set type for .NET

6 questions about generics.

Performance: Interfaces Vs. Inheritance (Abstract Base Classes) Vs. Generics

Thanks to the authors for the above links.

.NET Remoting

December 18, 2004

.NET Remoting is gaining a lot of importance, so here are some good links


Thanks to the authors for this info.

Windows/.NET Event logging (with Internationalization/parameter features in a message file)

November 18, 2004

Event logging pre-.NET
When you access the event log using the standard NT API calls, the system stores a structure that contains (amongst other things) the message ID and any replacement strings (“inserts”) for the message — but it does not store the message text itself.
Reading from the log
When you read an entry from an event log, the system reads the stored message ID and replacement strings, gets the text of the message for the current locale from a MESSAGETABLE resource contained within the file specified in the EventMessageFile key in the registry, inserts the replacement strings, and returns you the formatted string.
As well as keeping the log file small (which improves performance when accessing the event log on a remote machine), just storing the message ID and replacement strings also means that the same message can be viewed in different languages, as long as on the client:

  • The file that contains that locale’s MESSAGETABLE is installed
  • The local registry has been configured to tell NT where to find it

The file containing the messages only has to be installed on the machine that is doing the reading and does not have to exist on the one that is doing the writing or the one that holds the log (they can all be different machines).
Event logging with .NET
Under .NET, message sources are registered with the EventMessageFile value always set to EventLogMessages.dll, which is installed in the GAC. This file has 65,535 entries, each of which contains a single string: %1. In other words, for every possible event ID the entire format string is a placeholder that takes a single replacement string, which is always the message that you pass to EventLog.WriteEntry().

The main drawbacks with this approach are:

  • You have the responsibility of choosing the locale that should be used to format the message before writing it to the log and so all clients have to view the message in the same language
  • The log file is larger than necessary as it has to hold the full formatted string rather than just the message ID and replacement strings
  • If you want to view the entries written to a remote log on that machine, it must have the .NET runtime installed and the EventLogMessages.dll file registered in the remote computer’s GAC.

Read on for the solution class at….

Thanks to the Author for this public code.

SQL Server: @@IDENTITY deadlock problem and fix

November 18, 2004

This interesting problem occurs only when there is a call to update after the insert and the @@IDENTITY value has to be locked, so there is a deadlock trying to get a hold of this value.

CREATE TABLE dbo.test (
[a] [int] IDENTITY (1, 1) NOT NULL ,
[b] [varchar] (10) NULL ,
[c] [int] NULL
)

Here [a] and [c] have to have the same value.

So, this programmer goes ahead and adds a trigger to do this on the insert operation.

CREATE TRIGGER test_update ON dbo.test
FOR INSERT
AS
update dbo.test set c = a

And, the insert statement called by two threads (client processes) simultaneously is:

insert into test(b) VALUES ('test111')

This leads to a deadlock and this error message:
“Transaction (Process ID n) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
Rerun the transaction.”

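The fix:
insert into test(b,c) VALUES ('test111',@@IDENTITY)

Note that @@IDENTITY, when read inside the VALUES clause, returns the last identity value generated on the connection before this insert runs, so check that [c] really ends up equal to the new row’s [a].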
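
Another option, shown here only as a sketch and not as the fix used above, is to keep the trigger but restrict its UPDATE to the rows that were just inserted, using the inserted pseudo-table. The primary key on [a] is an assumption added so that the UPDATE can find those rows through an index instead of scanning (and locking) the whole table.

```sql
-- Same columns as the test table above; the PRIMARY KEY on [a] is an
-- assumption added for this sketch.
CREATE TABLE dbo.test (
    [a] [int] IDENTITY (1, 1) NOT NULL PRIMARY KEY,
    [b] [varchar] (10) NULL,
    [c] [int] NULL
);
GO

CREATE TRIGGER test_update ON dbo.test
FOR INSERT
AS
    SET NOCOUNT ON;
    -- "inserted" holds only the rows added by the triggering INSERT,
    -- so rows inserted by other sessions are not touched.
    UPDATE t
    SET    c = t.a
    FROM   dbo.test AS t
    INNER JOIN inserted AS i ON i.a = t.a;
GO

-- The original insert statement can then stay unchanged.
INSERT INTO dbo.test (b) VALUES ('test111');
SELECT a, b, c FROM dbo.test;
```

Because each session then updates only its own new row, two concurrent inserts should no longer wait on each other's locks; this is worth verifying under load before relying on it.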