With ever-increasing complexity in the software stacks running on our systems, we are starting to take the things that feed us, like power and cooling, for granted. Sure, on a global scale, the Netherlands has one of the most reliable power grids. That feed is backed up by diesel generators and a fully redundant power grid inside our primary data center. To get the generated heat out, there’s a fully redundant cooling system in place.
So with all this power and cooling hardware in place, we’re protected against everything… right? Well, think again, because the power grid and air conditioning systems are also controlled by… software! A seemingly harmless software update to the ACUs inside one of our suites caused a control valve to move in the opposite direction to the one its control software thought it was commanding, effectively shutting down cooling and causing a temperature rise of 10 degrees Celsius in little over 30 minutes. These are the kind of temperature rises that ultimately cause hardware to shut itself down. In this case, the problem was cleared before temperatures reached critical levels. If it hadn’t been, we would have been able to transparently fail everything over to a remote location, since the infrastructures we typically build are based on a twin data center active/active concept.
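To get a feel for how little time such a rise leaves, here is a rough back-of-the-envelope sketch. It assumes the temperature keeps climbing linearly at the observed 10 degrees per 30 minutes; the starting temperature and shutdown threshold in the example are hypothetical values, not measurements from our suite.

```python
def minutes_to_reach(current_c, limit_c, rise_c=10.0, window_min=30.0):
    """Minutes until limit_c is reached, assuming a linear rise of
    rise_c degrees per window_min minutes (10 degrees per 30 minutes here)."""
    if limit_c <= current_c:
        return 0.0
    return (limit_c - current_c) * window_min / rise_c


# Hypothetical numbers: a 22 degree room and a 45 degree shutdown threshold.
print(minutes_to_reach(22.0, 45.0))  # 69.0
```

In other words, under these assumptions you would have roughly an hour to notice and react before hardware starts protecting itself.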
This again shows that it doesn’t have to be the often-cited ‘plane crash’ that makes the case for building mission-critical infrastructures, like our customers’, across multiple data centers. Actually, I don’t think there are any recorded events of an airplane crashing into a data center. Instead, something like the firmware controlling your ACUs can jeopardize all equipment inside a single room or even an entire data center. Plan for failure and expect failure to come from unexpected sources.
All things considered, the twin datacenter active/active configuration is indeed too hot to handle!
After the latest patch round, WSUS 3.0 broke on me. The management snap-in kept failing with ‘not responding’, and remote MMC connections weren’t accepted anymore either.
I decided to remove and reinstall it, keeping the database and logs, but every reinstall failed, bombing out at about 90% with a dialog box stating ‘there is something wrong with your installation package’. As I knew for sure the package was fine (I tried both the SP1 and SP2 installers), it had to be something else.
The log file MWusSetup.log, located in the Windows temp folder, mentioned: ERROR CustomActions.Dll RemovePsfsip: Failed to load dll (Error 0x8007007E: The specified module could not be found.)
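If you want to pull such lines out of a setup log quickly, something like the sketch below may help. It is only an illustration: the function names are mine, the only log line used is the one quoted above, and real log lines may carry extra fields such as timestamps. It also splits the error code into its HRESULT fields, which shows that 0x8007007E is a Win32 error (facility 7) with code 126, the "module not found" error.

```python
import re

# Matches 32-bit hex codes such as 0x8007007E.
HRESULT_RE = re.compile(r"0x[0-9A-Fa-f]{8}")


def decode_hresult(value):
    """Split an HRESULT into failure bit, facility and code.
    The facility mask is the one used by the HRESULT_FACILITY macro in winerror.h."""
    return {
        "failure": bool(value >> 31),
        "facility": (value >> 16) & 0x1FFF,
        "code": value & 0xFFFF,
    }


def find_errors(log_text):
    """Return (line, hresult or None) for every line containing 'ERROR'."""
    hits = []
    for line in log_text.splitlines():
        if "ERROR" not in line:
            continue
        match = HRESULT_RE.search(line)
        hresult = int(match.group(0), 16) if match else None
        hits.append((line.strip(), hresult))
    return hits


sample = ("ERROR CustomActions.Dll RemovePsfsip: Failed to load dll "
          "(Error 0x8007007E: The specified module could not be found.)")
for line, hresult in find_errors(sample):
    if hresult is not None:
        print(hex(hresult), decode_hresult(hresult))
```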
After a little googling, I found a lot of references, but not one fully working solution.
What worked for me is this (reboot after every step):
Removed all .NET installs using the Microsoft utility cleanup_tool.exe
(http://blogs.msdn.com/astebner/attachment/8904493.ashx)
Stopped and removed the WsusCertService using the Windows Server 2003 Resource Kit utility instsrv.exe
(http://www.microsoft.com/downloads/details.aspx?FamilyID=9D467A69-57FF-4AE7-96EE-B18C4790CFFD&displaylang=en)
Cleaned the registry using CCleaner.
(http://www.ccleaner.com)
Reinstalled .NET 3.5 SP1
(http://download.microsoft.com/download/2/0/e/20e90413-712f-438c-988e-fdaa79a8ac3d/dotnetfx35.exe)
Removed the WSUS MMC cache files in my profile directory.
This finally allowed me to reinstall WSUS.
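The MMC cache cleanup can be scripted if you have to repeat it on several machines. The sketch below is a cautious, dry-run-by-default helper. It assumes the cache files live under %APPDATA%\Microsoft\MMC and have names starting with "wsus"; the post above does not state the exact location, so check this on your own system before deleting anything. The function names are made up for this example.

```python
import os
from pathlib import Path


def find_wsus_mmc_cache(mmc_dir):
    """List files in mmc_dir whose name starts with 'wsus' (case-insensitive)."""
    folder = Path(mmc_dir)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir()
                  if p.is_file() and p.name.lower().startswith("wsus"))


def remove_files(paths, dry_run=True):
    """Print what would be removed; only delete when dry_run is False."""
    for path in paths:
        print(("would remove " if dry_run else "removing ") + str(path))
        if not dry_run:
            path.unlink()


# Assumed location of the per-user MMC cache (not confirmed by the post).
default_dir = Path(os.environ.get("APPDATA", ".")) / "Microsoft" / "MMC"
remove_files(find_wsus_mmc_cache(default_dir))
```

Run it once as is to see what it would touch, then call remove_files(..., dry_run=False) if the list looks right.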
[BBG]

Recently we started evaluating Citrix EdgeSight on an environment we are currently building, consisting of XenApp 5 on Windows 2008 x64 and XenDesktop 4 farms.
After the installation of the EdgeSight agent, a bunch of applications running within a Java virtual machine suddenly stopped functioning, throwing the “Could not launch the java virtual machine” error.
These Java apps tried to allocate quite some memory using these Java arguments (e.g. -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=35 -XX:NewRatio=2, initial-heap-size="32m" max-heap-size="1024m")
After some investigation, a colleague (Hugo Trippaers) found out, using the memtest32.exe tool, that only 0.9 GB of memory was allocatable on our Citrix XenApp machines, while our other servers happily reported 1.5 GB of allocatable memory (within WOW64). (Physical machine: HP DL380 G6 with 48 GB of memory; uh, that should be enough?)
After some deeper digging using memalloc.exe, I discovered some substantial differences in memory allocation between our XenApp servers with the EdgeSight agent installed and servers without it.
Screenshot, XenApp servers with EdgeSight Agent 5.2 SP1 x64: memalloc.exe with EdgeSight
Screenshot, XenApp servers without EdgeSight: memalloc.exe without EdgeSight
The main difference here is all the Citrix hooks being loaded, see below.
These hooks apparently consume so much memory that Java could no longer allocate enough for its heap.
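The numbers make the failure easy to see. The sketch below assumes the JVM has to reserve its full maximum heap up front as one block of address space, as HotSpot-style JVMs typically do, and it treats the allocatable figures reported by memtest32.exe as the largest block a process can get, which is a simplification. The function name is made up for this example.

```python
def heap_fits(max_heap_mb, allocatable_gb):
    """True if a heap of max_heap_mb fits into allocatable_gb of address space."""
    allocatable_mb = allocatable_gb * 1024
    return max_heap_mb <= allocatable_mb


# Figures from the post: max-heap-size of 1024m, 0.9 GB vs 1.5 GB allocatable.
for label, allocatable_gb in [("with EdgeSight agent", 0.9),
                              ("without EdgeSight agent", 1.5)]:
    verdict = "fits" if heap_fits(1024, allocatable_gb) else "does not fit"
    print(f"{label}: 1024 MB heap {verdict} in {allocatable_gb} GB")
```

With 0.9 GB allocatable there are only about 921 MB to hand out, so a 1024 MB maximum heap cannot be reserved, while 1.5 GB leaves plenty of room.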
For more insights on WOW64 look here: http://blogs.msdn.com/gauravseth/archive/2006/04/26/583963.aspx
32-bit applications within WOW64 can leverage the full 4 GB of address space, which is not possible on a native 32-bit system because of the separation of kernel and user space.
To fully use the 4 GB available, applications need to be compiled with /LARGEADDRESSAWARE (Visual Studio: http://msdn.microsoft.com/en-us/library/wz223b1z(VS.80).aspx) or patched using editbin (http://bilbroblog.com/wow64/hidden-secrets-of-w0w64-ndash-large-address-space/); otherwise they can only allocate 1.6 GB of memory.
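Whether an executable was built this way can be read from its PE header: the large-address-aware setting is the 0x0020 bit (IMAGE_FILE_LARGE_ADDRESS_AWARE) in the Characteristics field of the COFF file header. A minimal sketch of that check follows. The function names are mine, and the example feeds it a tiny synthetic header rather than a real binary, so you would need to pass it the bytes of an actual .exe to check your own applications.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020


def is_large_address_aware(data):
    """Check the large-address-aware bit in the COFF Characteristics field."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    if len(data) < pe_offset + 24:
        raise ValueError("truncated COFF header")
    # Characteristics sits 18 bytes into the 20-byte COFF header,
    # which follows the 4-byte signature.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 22)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)


def fake_pe_header(characteristics):
    """Build a minimal synthetic MZ/PE header for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x014C, 0, 0, 0, 0, 0, characteristics)
    return bytes(dos) + b"PE\0\0" + coff


print(is_large_address_aware(fake_pe_header(0x0102)))  # flag not set
print(is_large_address_aware(fake_pe_header(0x0122)))  # flag set
```

For a real file, read it with open(path, "rb").read(4096) and pass the bytes in; the first few kilobytes are normally enough to cover the headers.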
We will open a case with Citrix on this; to be continued.
Citrix hooks being loaded when EdgeSight is installed: