center of tech
pickens writes “Dan Berry writes in the NY Times that the State of Alabama is spending millions of dollars in federal stimulus money to combat Cogongrass, a.k.a. the perfect weed, the killer weed, and the weed from another continent. A weed that ‘evokes those old science-fiction movies in which clueless citizens ignore reports of an alien invasion.’ Cogongrass (Imperata cylindrica) is considered one of the 10 worst weeds in the world. ‘It can take over fields and forests, ruining crops, destroying native plants, upsetting the ecosystem,’ writes Berry. ‘It is very difficult to kill. It burns extremely hot. And its serrated leaves and grainy composition mean that animals with even the most indiscriminate palates — goats, for example — say no thanks.’ Alabama’s overall strategy is to draw a line across the state at Highway 80 and eradicate everything north of it; then, in phases, to try to control it to the south. But the weed is so resilient that you can’t kill it with one application of herbicide, you have to return several months later and do it again. ‘People think this is just a grass,’ says forester Stephen Pecot. ‘They don’t understand that cogongrass can replace an entire ecosystem.’ Left unchecked, Pecot says ‘it could spread all the way to Michigan.’”
Read more of this story at Slashdot.
First of all, what follows is a hack rather than the official OpenSolaris solution for supporting an iSCSI boot device. The final solution will come from the caiman project rather than from here; AFAIK caiman is actively working on the support, and a draft plan may be available soon.
Currently OpenSolaris can’t be installed onto an iSCSI boot device from the liveCD. The major issue is that the iSCSI initiator module is not included in the liveCD, so the installer cannot access the iSCSI target.
However, by customizing (in other words, hacking) the AI (Automated Installer) process, it is not that difficult to try out iSCSI boot with OpenSolaris.
Requirements:
For x86, the build number of OpenSolaris should be 104+, and the machine should have at least two NICs - one to support PXE and the other to support iBFT.
For sparc, the build number of OpenSolaris should be 127+(per current plan), and the machine should have an updated OBP (should be coming out soon) to support iSCSI boot.
Steps:
1. First, configure an AI server by following the AI instructions.
2. Modify the default manifest to specify the iSCSI target info by adding a harmless comment, e.g.,
<!--
iscsi-target-name=iqn.1986-03.com.sun:02:1234567890abcdef
iscsi-target-ip=129.158.144.200
iscsi-lun=1
-->
3. Configure a default ‘target_device’ in the manifest. This can be inserted before the
<ai_pkg_repo_default_authority> section.
<ai_target_device>
<target_device_name>0</target_device_name>
<target_device_install_slice_number>0</target_device_install_slice_number>
</ai_target_device>
4. Also don’t forget to specify the iSCSI initiator package along with IDM in the manifest, by adding the following items to the <ai_install_packages> section.
<pkg name="SUNWiscsi"/>
<pkg name="SUNWiscsidm"/>
5. Customize (hack) the auto-installer in the microroot.
For x86, assuming the AI target directory on the AI server is /export/home/ai_server, the microroot can be customized this way:
# cd /export/home/ai_server/boot
# gzcat x86.microroot >/tmp/miniroot
# lofiadm -a /tmp/miniroot
/dev/lofi/1
# mount /dev/lofi/1 /mnt
Then open /mnt/lib/svc/method/auto-installer with your preferred editor and locate the following paragraph.
===============================================
echo "Automated Installation started" | $TEE_LOGTOCONSOLE
echo "The progress of the Automated Installation can be followed by viewing " \
"the logfile at /tmp/install_log" | $TEE_LOGTOCONSOLE
===============================================
Note that this is a shell script executed on the client side, so this is where we need to add some custom commands to:
1) Establish the connection to the iSCSI target
2) Identify the OS device name of the iSCSI disk
3) Update the manifest to include the iSCSI disk.
One way to do this is to add the following commands just below the above paragraph.
# ========Below will add iscsi configuration for AI client. ============
echo "begin iSCSI configuration…" | $TEE_LOGTOCONSOLE
# get target name
input=`cat $AISC_MANIFEST | grep iscsi-target-name=`
target_name=`echo $input | awk -F"=" '{print $2}'`
# get target ip address
input=`cat $AISC_MANIFEST | grep iscsi-target-ip=`
target_ip=`echo $input | awk -F"=" '{print $2}'`
# get lun number
input=`cat $AISC_MANIFEST | grep iscsi-lun=`
lun=`echo $input | awk -F"=" '{print $2}'`
lun="LUN: $lun"
echo "Destination LUN from manifest is $lun on target $target_name" | $TEE_LOGTOCONSOLE
# add the static-config and enable the discovery
/usr/sbin/iscsiadm add static-config $target_name,$target_ip
/usr/sbin/iscsiadm modify discovery -s enable
# wait here for a while
sleep 10
/usr/sbin/devfsadm -C
sleep 30
/usr/sbin/iscsiadm list target -S >/tmp/client_target.out
test=`cat /tmp/client_target.out | grep "$lun" | wc -l`
if [ $test = "0" ] ; then
echo "can’t find $lun on target $target_name" | $TEE_LOGTOCONSOLE
exit $SMF_EXIT_ERR_FATAL
fi
# get the os device name of the LUN
i=`sed -n -e /"$lun"/= /tmp/client_target.out`
line=`expr "$i" "+" 3`
string=`sed -n -e ${line}p /tmp/client_target.out`
tmp=`echo ${string} | awk -F"/" '{print $4}'`
name=`echo $tmp | sed 's/..$//'`
echo "Get $name from local disk table for installation" | $TEE_LOGTOCONSOLE
# replace the device name in the manifest
cat $AISC_MANIFEST | sed "s+<target_device_name>.+<target_device_name>${name}+" >/tmp/ai_combined_manifest.xml.2
mv /tmp/ai_combined_manifest.xml.2 $AISC_MANIFEST
echo "iSCSI configuration completes" | $TEE_LOGTOCONSOLE
# =============end of getting iscsi configuration ==================
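The three grep/awk pairs and the sed/awk device-name extraction above can also be written as two small helpers. This is only a sketch, not part of the original hack: the function names manifest_value and os_device_name are made up for illustration, the sample device line is hypothetical, and on Solaris of that vintage nawk may be needed in place of awk.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "key=value" line in a file.
# $1 = key (e.g. iscsi-lun), $2 = path to the manifest
manifest_value() {
    awk -F= -v key="$1" '
        { sub(/^[ \t]+/, "", $1) }                  # tolerate indented lines
        $1 == key { sub(/[ \t\r]+$/, "", $2); print $2; exit }
    ' "$2"
}

# Hypothetical helper: reduce an "OS Device Name" line to the bare disk name.
# $1 = a line such as "OS Device Name: /dev/rdsk/c2t1d0s2"
os_device_name() {
    # the 4th "/"-separated field is the device node; drop the 2-char slice suffix
    echo "$1" | awk -F/ '{print $4}' | sed 's/..$//'
}

# Example with a made-up device line (not taken from a real system):
os_device_name "OS Device Name: /dev/rdsk/c2t1d0s2"
```

In the method script these could be called as, for example, target_ip=`manifest_value iscsi-target-ip $AISC_MANIFEST`.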
6. Save the script, then unmount /mnt and delete the lofi device.
# umount /mnt
# lofiadm -d /dev/lofi/1
7. Repack and replace the microroot.
# gzip /tmp/miniroot
# mv /tmp/miniroot.gz /export/home/ai_server/boot/x86.microroot
Now go ahead and install the client. Good luck!

Source : http://blogs.sun.com/cancel/entry/hacking_ai_installation_process_to
I’ve finally worked out how to drive purple-url-handler. Strictly speaking,
John worked it out, so I will stand on his shoulders, but for some reason
it would not work for me. I now know why and have a workaround.
First you need an XMPP URI on a web page. Something like:
xmpp:chrisg_fans@muc.im.sun.com?join
when clicked in a browser that has the right helper (something
OpenSolaris has had for some time), will take your IM client
to that room. However, with pidgin that is only the case if that room
is available on the first XMPP server in your list of
accounts. So given that this room is on Sun’s IM server, with the list
of accounts looking like this:

it will try to connect to the first XMPP server listed, which is Google,
and hence fail. Changing the order to be:

and then logging out and back in makes the link work. You can drag and
drop the entries in pidgin to reorder them.
Source : http://blogs.sun.com/chrisg/entry/pidgin_url_handler_and_xmpp
An exclusive symposium on increasing energy efficiency through enterprise IT was held in Bangalore in July 2009, where I shared my experiences of creating datacenters, managing energy efficiency, and the challenges across the APAC region.
With Moore’s law in action, computing capabilities are increasing
The need of the hour is to get equipped with new technologies
Earlier business units were working as silos, but now it’s like a
There’s a real good opportunity to cut costs in datacenters,
In India power outages and the rising cost of energy are important issues to tackle, as they have a direct impact on businesses.
Source : http://blogs.sun.com/kvr/entry/walking_the_green_mile

It is not so difficult to create a NumberBox control using the TextBox control, triggers and data binding:
import javafx.scene.*;
import javafx.scene.control.*;
public class NumberBox extends CustomNode {
    public var value:Number on replace { str = "{value}"};
    var str:String on replace{
        try{
            value = Number.valueOf(str);
        } catch(e){
            str = "0";
        }
    }
    public override function create(): Node {
        TextBox {
            columns: 12
            selectOnFocus: true
            text: bind str with inverse
        }
    }
}
Example with the NumberBox control:
import javafx.stage.*;
import javafx.scene.*;
import javafx.scene.layout.*;
import javafx.scene.control.*;
var num = 10.0;
Stage {
    title: "NumberBox"
    width: 250
    height: 280
    scene: Scene {
        content: VBox{
            translateX: 20
            translateY: 20
            content: [
                Slider {
                    min: 0
                    max: 100
                    value: bind num with inverse
                }
                NumberBox{
                    value: bind num with inverse
                }
            ]
        }
    }
}
Source/Kaynak : http://blogs.sun.com/alexsch/entry/numberbox_control
An interesting Sun Ray reference in Cape Verde. It is worth spending 2 minutes on the video that MGO Consulting has posted on YouTube.
http://www.youtube.com/watch?v=7PF9f6ngadc
Source/Kaynak : http://blogs.sun.com/pedrovalcarcel/entry/sun_rays_mindelo_cape_verde
The install process is basically the same as installing Solaris x86 onto an iSCSI disk. The biggest difference is how the firmware is configured: before booting into Solaris, the x86 platform relies on iBFT-capable firmware (BIOS) to communicate with the iSCSI target, while the SPARC platform relies on OBP to do much the same thing.
Before proceeding, please make sure the system is running OBP version 4.31 or later and that the ‘show-iscsi’ command is available.
Collect the following items before starting the installation. Some of them are used during
the installation process, and some may also be part of the boot argument to the ‘boot’ command in OBP.
* iSCSI Target IP/Port
* Router/Gateway IP if the iSCSI Target is on a different subnet
* Which ethernet interface to be used to access the iSCSI target
* Lun number which will be used as the root disk
The installation process is very similar to the x86 case described in an earlier post, for
both CD and network installation and for both desktop and console sessions. However, a few more items need to be collected for booting the OS later.
* Target Name
* Root Slice if it is not ‘a’ as default
Also, specify CHAP via ‘iscsiadm’ if authentication is set up on the target side. For detailed steps, please refer to Chapter 14 of the System Administration Guide: Devices and File Systems.
A special boot device argument needs to be composed to perform iSCSI boot in OBP, in the format:
‘net:key=value[,...]’
The following keys are used to support iSCSI boot:
iscsi-target-ip <Required> iSCSI Target IP address
iscsi-target-name <Required> iSCSI Target Name
host-ip <Required> Host IP address
router-ip <Optional> The gateway IP address. It may not be necessary if the host and the iSCSI target are within the same subnet.
iscsi-lun <Optional> The logical unit number (LUN) required by iSCSI boot, in hexadecimal dash-separated format; defaults to 0. An example of a fully specified number is 2-0-0-0, although it is usually given simply as ‘2’.
iscsi-port <Optional> iSCSI target IP port. It is a decimal formatted integer from 1 to 65535, defaults to 3260.
iscsi-partition <Optional> The bootable partition on the iscsi target, defaults to "a".
If you use CHAP as the authentication method, you can set the CHAP user name and password at the OBP ok prompt as follows:
{0} ok set-ascii-security-key chap-user <your chap name>
{0} ok set-ascii-security-key chap-password <your chap secret>
Note that bidirectional authentication is not available here.
An example of the full argument would be:
net:iscsi-target-name=iqn.1986-03.com.sun:2510.600a0b800049c94d00000000493c920b,host-ip=10.13.49.129,iscsi-lun=3-0-0-0,iscsi-target-ip=10.13.49.145,router-ip=10.13.49.1
A devalias is probably preferable for such a long argument; it can then be passed to the ‘boot’ command in OBP.
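Since the argument is just a comma-separated list of key=value pairs, it can be assembled from shell variables on another machine before being typed in at the ok prompt. A minimal sketch, assuming a hypothetical helper named iscsi_boot_arg and hypothetical variable names; the key order simply mirrors the example above, and whether OBP cares about the order is not something this post establishes.

```shell
#!/bin/sh
# Hypothetical helper (not an OBP or Solaris tool): compose the boot-device
# argument from variables. TARGET_NAME, TARGET_IP and HOST_IP are required;
# LUN, ROUTER_IP, PORT and PARTITION are optional and omitted when empty.
iscsi_boot_arg() {
    if [ -z "$TARGET_NAME" ] || [ -z "$TARGET_IP" ] || [ -z "$HOST_IP" ]; then
        echo "iscsi_boot_arg: TARGET_NAME, TARGET_IP and HOST_IP are required" >&2
        return 1
    fi
    arg="net:iscsi-target-name=${TARGET_NAME},host-ip=${HOST_IP}"
    arg="${arg}${LUN:+,iscsi-lun=${LUN}}"                   # optional
    arg="${arg},iscsi-target-ip=${TARGET_IP}"
    arg="${arg}${ROUTER_IP:+,router-ip=${ROUTER_IP}}"       # optional
    arg="${arg}${PORT:+,iscsi-port=${PORT}}"                # optional
    arg="${arg}${PARTITION:+,iscsi-partition=${PARTITION}}" # optional
    echo "$arg"
}

# Values taken from the example argument above.
TARGET_NAME=iqn.1986-03.com.sun:2510.600a0b800049c94d00000000493c920b
TARGET_IP=10.13.49.145
HOST_IP=10.13.49.129
LUN=3-0-0-0
ROUTER_IP=10.13.49.1
iscsi_boot_arg
```

With those values the helper prints the same string as the example argument, which makes it easy to check a hand-typed argument for typos.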
Source/Kaynak : http://blogs.sun.com/cancel/entry/guidance_for_installing_solaris_nevada
via Expert challenges UFO hacker’s $700k bill | 22 Sep 2009 | ComputerWeekly.com.
The US inflated the $700,000 bill for damages it slapped on UFO hacker Gary McKinnon by stuffing it with costs incurred for patching the gaping holes the hacker had exposed in its computer security, according to a document filed with the Supreme Court.
The US had not taken reasonable steps to protect its security and now expects McKinnon to pick up the bill, said an expert witness statement made in McKinnon's ongoing appeal against a US extradition order.
Peter Sommer, professor of security at the London School of Economics, said damage assessments of computer security breaches should consider “whether the victims have taken reasonable steps to limit the damage”.
[...]
“Any firewall also ought to block the 'ports' [internet access points on a computer] used by Remotely Anywhere. On this basis, the costs claimed for are features that should have been there in the first place.”
Sommer, who once advised insurers underwriting the risks of computer damage, said hackers could not be held accountable for the “consequential loss” resulting from their intrusion into systems unprotected by “preventative measures for reasonably foreseeable hazards”.
“Insurers will not insure computers or computer-dependent businesses in the absence of reasonable levels of protection and means of recovery,” he said.
But security experts in the US said McKinnon should be liable for the full $700,000 of security checks performed in his wake.
Professor Eugene Spafford, founder of the Center for Education and Research in Information Assurance and Security at Indiana's Purdue University, said the victim of a cybercrime should not take the blame. If someone broke a door to rob a store, he said, it was usual to charge them the cost of the door.
Anthony Reyes, a former cybercrime detective who helped develop the US Cyber Counter Terrorism Investigations Program, said, “Just because security is weak, it doesn't give you a red flag to go into a computer system and start browsing around.”
Count me with Peter Sommer on this one; I consider Reyes’ “red flag” quote to be specious, and respect Spaf as I greatly do, walking up to a door and through it regardless of a presumed “No Entry” sign does not constitute “breaking it down”; maybe faffing with buffer overflows does but having recently had 5 doors replaced at £200 per diem I am well aware of the difference between replacing broken doors and configuring a firewall properly.
Also: firewall rules do not need to be painted or weatherproofed, and they are more easily draught-proofed – at least, if they are not being installed by the US Military.
There is a perpetual tension in security analogies between the physical and virtual worlds, and all analogies break down eventually. My distribution of Crack back in the 90s was described as “handing out guns” (example response) – yet today it’s mostly forgotten, and the software which usurped it[1] is on the verge of being forgotten, too.
Nowadays there are just far too many other ways to hack, and the security challenge today exceeds the capabilities of the security generalist; that’s probably a good thing, it guarantees us all employment –
– but it also does increase the scope for bad analogy. NMap was bad and became good, Stumblers were evil – and WarChalking was the sigil of the beast, even if I never saw any – yet now every phone has a “Wifi Scanner” application.
It’s all a matter of getting over the neophobia.
–
[1] cloning Crack’s dictionary generation in the process – imitation is the sincerest, Solar?
this posting is syndicated from dropsafe
Source/Kaynak : http://www.crypticide.com/dropsafe/article/3473
From yesterday’s hwsw.hu article: "Sun’s hardware division is of key importance to Oracle, not so that it can enter the hardware market, but because, as a systems vendor, having its own hardware platform is essential, said Larry Ellison, the company’s chief, outlining Oracle’s plans at a business forum yesterday."
Source/Kaynak : http://blogs.sun.com/sunhu/entry/hwsw_hu_cikk_ellison_az
This Thursday (September 24th, 14:00 UTC), Heikki Tuuri, the father of InnoDB, will give a session on Concurrency Control: How It Really Works. He’ll describe how InnoDB manages concurrency control, so that the system protects data integrity. Beginning with the basics of transaction management, Heikki will include a discussion of the ACID (atomicity, consistency, isolation, and durability) properties, and explain various transaction modes, locking, deadlocks, and more advanced topics such as the impact of next-key (gap) locking, referential integrity, XA (distributed transaction management) support, and more. While the discussion will focus on the InnoDB implementation, many of the concepts presented apply to other database systems and storage engines.
For MySQL University sessions, point your browser to this page. You need a browser with a working Flash plugin. You may register for a Dimdim account, but you don’t have to. (Dimdim is the conferencing system we’re using for MySQL University sessions. It provides integrated voice streaming, chat, whiteboard, session recording, and more.) All MySQL University sessions are recorded, that is, slides and voice can be viewed as a Flash movie (.flv). You can find those recordings on the respective MySQL University session pages, which are listed on the MySQL University home page.
MySQL University is a free educational online program for engineers/developers. MySQL University sessions are open to anyone, not just Sun employees. Sessions are recorded (slides and audio), so if you can’t attend the live session you can look at the recording anytime after the session.
Here’s the schedule for the upcoming weeks:
The schedule is not engraved in stone at this point. Please visit http://forge.mysql.com/wiki/MySQL_University#Upcoming_Sessions for the up-to-date list. On that page, you can also find the starting time for many time zones.
Source/Kaynak : http://blogs.sun.com/mysqlf/entry/mysql_university_concurrency_control_how
Last Thursday (September 17, 13:00 UTC), Lars Thalmann explained the Architecture of MySQL Backup. Lars is leading the MySQL Replication & Backup teams, and has given several MySQL University sessions before.
I was on sick leave last week and forgot to announce this session - sorry! However, since the session was recorded (video & audio), you can listen to it anytime. Please find the recording and the presentation slides on this page.


Source/Kaynak : http://blogs.sun.com/mysqlf/entry/mysql_university_architecture_of_mysql
Ten months ago Sun announced the Sun Storage 7000 range, a storage "appliance" in the Open Storage family built by combining servers, storage (disks and flash memory) and a broad range of enterprise-grade storage software based on free software, which, by getting the most out of ZFS, offered users a solution with very high performance in speed, security, availability and energy savings, with great ease of use and installation and, of course, multiplatform support.
John Fowler, EVP of Systems, says in the accompanying video that in less than a year more than 35 PB (35,000 TB) have already been delivered across thousands of systems. He also takes the opportunity to announce improvements that add performance and simplify management even further.
Of the sixteen improvements included, the highlights are:
This update not only delivers more performance but also tolerates failures of up to three disks,
reducing the risk of data loss when using high-capacity disks with long rebuild times.
And, of course, it keeps its multiplatform character, being agnostic to the type of servers it connects to.
In Spain there is already a good installed base in education, and it is starting to gain ground in public administration and healthcare, held back initially by the inertia of the installed base and by bulk purchases such as the one Red.es made when this solution was not yet available. But the push from Law 11/2007 and from EMR and PACS solutions in healthcare, in a context of severe budget cuts, puts Open Storage in a very attractive starting position thanks to its performance and price.
Executive information on the Sun Storage 7000 family is in this video of under five minutes. More detailed information is at this link. To try it for 60 days with no commitment, see this address.
Source/Kaynak : http://blogs.sun.com/eloy/entry/open_storage_el_beb%C3%A9_ya
Today we have released an update to the Sun Storage 7410, which upgraded the CPUs from Barcelona to Istanbul:
| 7410 (Barcelona) | 7410 (Istanbul) |
|---|---|
| Max 4 sockets quad core AMD Opteron CPU | Max 4 sockets of six core AMD Opteron CPU |
| Max 128 Gbytes DRAM | Max 256 Gbytes DRAM |
| HyperTransport 1.0 | HyperTransport 3.0 |
This is per head node, so a 2-way cluster can bring half a Terabyte
of DRAM for filesystem caching.
But what has me most excited is the upgrade of main system bus,
from AMD’s HyperTransport 1 to HyperTransport 3. In this blog post I’ll
explain why, and post numbers with the new 7410.
The following screenshots show the Maintenance->Hardware screen from the original and the new 7410:
The following results were collected from the two 7410s shown above.
| Workload | 7410 (Barcelona) | 7410 (Istanbul) | Improvement |
|---|---|---|---|
| NFSv3 streaming cached read | 2.15 Gbytes/sec | 2.68 Gbytes/sec | 25% |
| NFSv3 8k read ops/sec | 127,143 | 223,969 | 75% |
A very impressive improvement from what were already great results.
Both of these results are reading a cached working set over NFS, so the
disks are not involved. The CPUs and HyperTransport were upgraded, and these
cached workloads were chosen to push those components to their limits (not the
disks), to see the effect of the upgrade.
The following screenshots are the source of those results, and were taken
from Analytics on the 7410 - showing what the head node really did. These
tests were performed by Sun’s Open Storage Systems group (OSSG). I was able to log in
to Analytics on their systems and take screenshots from the tests they
performed after the fact (since Analytics archives this data) and check
that these results were consistent with my own - which they are.
Streaming cached read:
Notice that we can now reach 2.15 Gbytes/sec for NFSv3 on the original 7410 (60 second average of network throughput, which includes protocol headers.)
When I first blogged about the 7410 after launch, I was reaching 1.90
Gbytes/sec; sometime later that became 2.06 Gbytes/sec. The difference is the
software updates - we are gradually improving our performance release after
release.
8k cached read ops:
As a sanity check, we can multiply the observed NFS read ops/sec by their
known size - 8 Kbytes: 127,143 x 8 Kbytes = 0.97 Gbytes/sec. Our observed
network throughput was 1.01 Gbytes/sec, which is consistent with 127K x 8 Kbyte
read ops/sec (higher as it includes protocol headers.)
Streaming cached read:
2.68 Gbytes/sec - awesome!
8k cached read ops:
This is 75% faster than the original 7410 - this is no small hardware
upgrade! As a sanity test, this showed 223,969 x 8 Kbytes = 1.71 Gbytes/sec.
On the wire we observed 1.79 Gbytes/sec, which includes protocol headers. This is consistent with the expected throughput.
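The sanity check above is simple arithmetic, and it can be scripted so it is easy to repeat for other results. A small sketch, assuming a hypothetical function name ops_to_gbytes and binary units (1 Gbyte = 1024 x 1024 Kbytes), which is the convention that reproduces the figures quoted here.

```shell
#!/bin/sh
# Hypothetical helper: convert read ops/sec of a known size to Gbytes/sec.
# $1 = ops/sec, $2 = I/O size in Kbytes. Assumes 1 Gbyte = 1024 * 1024 Kbytes.
ops_to_gbytes() {
    awk -v ops="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", ops * kb / (1024 * 1024) }'
}

ops_to_gbytes 127143 8    # original 7410: prints 0.97
ops_to_gbytes 223969 8    # new 7410:      prints 1.71
```

Both payload figures come out a little below the observed network throughput (1.01 and 1.79 Gbytes/sec), as expected, since the network numbers include protocol headers.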
The systems tested above were the Barcelona-based and Istanbul-based 7410, both with max CPU and
DRAM, and both running the latest software (2009.Q3.) The same 41 clients were used to
test both 7410s.
The Sun Storage 7410 could support four ports of 10 GbE, with a theoretical
combined maximum throughput of 40 Gbit/sec, or 4.64 Gbytes/sec. However in practice it was
reaching about 2.06 Gbytes/sec when reading cached data over NFS. While
over 2 Gbytes/sec is fantastic (and very competitive), why not over 3 or 4
Gbytes/sec?
First of all, if you keep adding high speed I/O cards to a system, you may run out of system resources to drive them before you run out of slots to plug them into. Just because the system lets you plug them all in, doesn’t mean that the CPUs, busses and software can drive it at full speed. So, given that, what specifically stopped the 7410 from going faster?
It wasn’t CPU horsepower: we had four sockets of quad-core Opteron and
the very scalable Solaris kernel. The bottleneck was actually the
HyperTransport.
The HyperTransport is used as the CPU interconnect and the path to the
I/O controllers. Any data transferred with the I/O cards (10 GbE cards, SAS
HBAs, etc), will travel via the HTs. It’s also used by the CPUs so they can
access each other’s memory. In the diagram above, picture CPU0 accessing the memory which
is directly attached to CPU3 - which would require two hops over HT links.
A clue that the HyperTransport (and memory busses) could be the bottleneck
was found with the Cycles Per Instruction (CPI):
walu# ./amd64cpi-kernel 5
Cycles Instructions CPI %CPU
167456476045 14291543652 11.72 95.29
166957373549 14283854452 11.69 95.02
168408416935 14344355454 11.74 95.63
168040533879 14320743811 11.73 95.55
167681992738 14247371142 11.77 95.26
[...]
amd64cpi-kernel is a simple script I wrote (these scripts are not supported by Sun) to pull the CPI from the AMD CPU PICs (Performance Instrumentation Counters). The higher the CPI, the more
cycles are spent waiting for memory loads/stores, which stall instructions. A CPI of
over 11 is the highest I’ve ever seen - a good indication that we are waiting
a significant time for memory I/O.
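The CPI column is just the first column divided by the second. As a quick sketch (the function name cpi is made up for illustration), here is that division applied to the first line of the output above:

```shell
#!/bin/sh
# Hypothetical helper: cycles per instruction from two raw counter values.
# $1 = cycles, $2 = instructions
cpi() {
    awk -v c="$1" -v i="$2" 'BEGIN {
        if (i == 0) exit 1          # avoid dividing by zero
        printf "%.2f\n", c / i
    }'
}

cpi 167456476045 14291543652    # first sample above: prints 11.72
```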
Also note in the amd64cpi-kernel output that I included %CPU - CPU
utilization. With a CPU utilization of over 95%, how many of you would be
reaching for extra or faster CPU cores to improve the system? This is a problem
for all %CPU measurements - yes, the CPU was processing instructions, but it
wasn’t performing ‘work’ that you assume - instead those instructions are
stalled waiting for memory I/O. Add faster CPUs, and you stall faster (doesn’t
help.) Add more cores or sockets, and you could make the situation worse -
spreading the workload over more CPUs can decrease the L1/L2 CPU cache hit
rates, putting even more pressure on memory I/O.
To investigate the high CPI, I wrote more scripts to figure out what the
memory buses and HT buses were doing. My amd64htcpu script shows
the HyperTransport transmit Mbytes/sec, by both CPU and port (notice in the
diagram each CPU has 3 HT ports.):
walu# ./amd64htcpu 1
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3170.82 595.28 2504.15 0.00
1 2738.99 2051.82 562.56 0.00
2 2218.48 0.00 2588.43 0.00
3 2193.74 1852.61 0.00 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3165.69 607.65 2475.84 0.00
1 2753.18 2007.22 570.70 0.00
2 2216.62 0.00 2577.83 0.00
3 2208.27 1878.54 0.00 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3175.89 572.34 2424.18 0.00
1 2756.40 1988.03 578.05 0.00
2 2191.69 0.00 2538.86 0.00
3 2186.87 1848.26 0.00 0.00
[...]
This shows traffic on socket 0, HT0 was over 3 Gbytes/sec. These
HyperTransport 1 links have a theoretical maximum of 4 Gbytes/sec, which we are
approaching. While they may not be at 100% utilization (for a 1 second
interval), we have multiple cores per socket trying to access a resource that
has reasonably high utilization - which will lead to stalling.
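To put a number on "approaching", the busiest link in a sample can be picked out and compared with the theoretical maximum. A rough sketch, assuming a hypothetical function name busiest_ht_link and treating the 4 Gbytes/sec limit as 4000 MB/s (with 4096 the percentage would be slightly lower); the input is the first one-second sample shown above.

```shell
#!/bin/sh
# Hypothetical helper: read amd64htcpu-style output on stdin and report the
# busiest link. $1 = assumed per-link capacity in MB/s.
busiest_ht_link() {
    awk -v cap="$1" '
        $1 == "Socket" { next }                 # skip header lines
        NF >= 2 {
            for (f = 2; f <= NF; f++)           # columns 2.. are HT0, HT1, ...
                if ($f + 0 > max) { max = $f + 0; sock = $1; port = f - 2 }
        }
        END {
            if (max > 0 && cap > 0)
                printf "socket %s HT%d %.2f MB/s %.0f%%\n", sock, port, max, 100 * max / cap
        }
    '
}

busiest_ht_link 4000 <<'EOF'
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3170.82 595.28 2504.15 0.00
1 2738.99 2051.82 562.56 0.00
2 2218.48 0.00 2588.43 0.00
3 2193.74 1852.61 0.00 0.00
EOF
```

Under that assumption the busiest link (socket 0, HT0) is at roughly 79% of the theoretical maximum, averaged over one second.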
After identifying memory I/O on HyperTransport 1 as a potential bottleneck, we were able
to improve the situation a few ways:
With these changes, our performance improved and the CPI was down to about
10. To go further, we needed HyperTransport 3.
HT3 promised to triple the bandwidth, however when I first got a prototype
HT3 system I was disappointed to discover that the max NFSv3 throughput was the
same. It turned out that I had been sent
upgraded CPUs, but on a HT1 system. If anything, this further confirmed what I
had suspected - faster CPUs didn’t help throughput, we needed to upgrade the
HT.
When I did get a HT3 system, the performance was considerably better -
between 25% and 75%. HT links:
topknot# ./amd64htcpu 1
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5387.91 950.88 4593.76 0.00
1 6432.35 5705.65 1189.07 0.00
2 5612.83 3796.13 6312.00 0.00
3 4821.45 4703.95 3124.07 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5408.83 973.48 4611.83 0.00
1 6443.95 5670.06 1166.64 0.00
2 5625.43 3804.35 6312.05 0.00
3 4737.19 4602.82 3060.74 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5318.46 971.83 4544.80 0.00
1 6301.69 5558.50 1097.00 0.00
2 5433.77 3697.98 6110.82 0.00
3 4581.89 4464.40 2977.55 0.00
[...]
HT3 is sending more than 6 Gbytes/sec over some links. The CPI was down to
6, from 10. The difference to performance numbers was huge:
Along with the CPU upgrade (which helps IOPS), and DRAM upgrade (helps
caching working sets), the 7410 hardware update was looking to be an incredible upgrade
to what was already a powerful system.
If I’ve whetted your appetite for more CPU PIC analysis, on Solaris run “cpustat
-h” and fetch the document it refers to, which will contain a reference for the
CPU PICs for the platform you are on. The scripts I used above are really not
that complicated - they use shell and perl to wrap the output (as the man page
for cpustat even suggests!) Eg, the amd64cpi-kernel tool was:
#!/usr/bin/sh
#
# amd64cpi-kernel - measure kernel CPI and Utilization on AMD64 processors.
#
# USAGE: amd64cpi-kernel [interval]
# eg,
# amd64cpi-kernel 0.1 # for 0.1 second intervals
#
# CPI is cycles per instruction, a metric that increases due to activity
# such as main memory bus lookups.
interval=${1:-1} # default interval, 1 second
set -- `kstat -p unix:0:system_misc:ncpus` # assuming no psets,
cpus=$2 # number of CPUs
pics='BU_cpu_clk_unhalted,sys0' # cycles
pics=$pics,'FR_retired_x86_instr_w_excp_intr,sys1' # instructions
/usr/sbin/cpustat -tc $pics $interval | perl -e '
printf "%16s %16s %8s %8s\n", "Cycles", "Instructions", "CPI", "%CPU";
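    while (<>) {
        next if ++$lines == 1;          # skip the cpustat header line
        @_ = split;                     # explicit: implicit split to @_ is deprecated
        $total += $_[3];                # denominator for the %CPU column
        $cycles += $_[4];               # first pic: unhalted cycles
        $instructions += $_[5];         # second pic: retired instructions
        # print one summary line per interval, after all CPUs have reported
        if ((($lines - 1) % '$cpus') == 0) {
            printf "%16u %16u %8.2f %8.2f\n", $cycles,
                $instructions, $cycles / $instructions, $total ?
                100 * $cycles / $total : 0;
            $total = 0;
            $cycles = 0;
            $instructions = 0;
        }
    }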
'
A gotcha for this one is the “sys” modifier on the pics definitions; they make these PICs record activity during both user-code and kernel-code, not just user-code.
I’ve previously posted many numbers
covering 7410 performance, although I had yet to collect the full set. I
was missing iSCSI, FTP, HTTP and many others. This hardware upgrade changes
everything - all my previous numbers are now out of date. The numbers
for the new 7410 are so far between 25% and 75% better than what I had posted
previously!
Performance testing is like painting the Golden
Gate Bridge: once you reach the end you must immediately begin at the start
again. In our case, there are so many software and hardware upgrades that
once you approach completing perf testing, the earlier numbers are out of date.
The OSSG group (who gathered the numbers at the start of this post)
are starting to help out so that we can test and share numbers more
quickly.
I’ve created a new column of numbers on my summary
post, and I’ll fill out the new numbers as I get them.
For this 7410 upgrade, the extra CPU cores help - but it’s more about the
upgrade to the HyperTransport. HT3 provides 3x the CPU interconnect
bandwidth, and dramatically improves the delivered performance of the 7410:
from 25% to 75%. The 7410 was already a powerful server, it’s now raised the
bar even higher.
Source/Kaynak : http://blogs.sun.com/brendan/entry/7410_hardware_update_and_analyzing
Today we have released an update to the Sun Storage 7410, which upgrades the CPUs from Barcelona to Istanbul:
| 7410 (Barcelona) | 7410 (Istanbul) |
|---|---|
| Max 4 sockets of quad-core AMD Opteron CPU | Max 4 sockets of six-core AMD Opteron CPU |
| Max 128 Gbytes DRAM | Max 256 Gbytes DRAM |
| HyperTransport 1.0 | HyperTransport 3.0 |
This is per head node, so a 2-way cluster can bring half a Terabyte
of DRAM for filesystem caching.
But what has me most excited is the upgrade of the main system bus,
from AMD’s HyperTransport 1 to HyperTransport 3. In this blog post I’ll
explain why, and post numbers with the new 7410.
The following screenshots show the Maintenance->Hardware screen from the original and the new 7410:
The following results were collected from the two 7410s shown above.
| Workload | 7410 (Barcelona) | 7410 (Istanbul) | Improvement |
|---|---|---|---|
| NFSv3 streaming cached read | 2.15 Gbytes/sec | 2.68 Gbytes/sec | 25% |
| NFSv3 8k read ops/sec | 127,143 | 223,969 | 75% |
A very impressive improvement from what were already great results.
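The Improvement column can be recomputed from the two result columns. The following is a minimal sketch, not part of the original test tooling; the `improvement` helper name is made up for illustration, and it assumes a POSIX shell with awk available.

```shell
# improvement OLD NEW: print the percentage gain of NEW over OLD.
# (Hypothetical helper for illustration only.)
improvement() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

improvement 2.15 2.68        # NFSv3 streaming cached read, Gbytes/sec
improvement 127143 223969    # NFSv3 8k read ops/sec
```

This prints 24.7 and 76.2, so the table's 25% and 75% are rounded figures.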
Both of these results are reading a cached working set over NFS, so the
disks are not involved. The CPUs and HyperTransport were upgraded, and these
cached workloads were chosen to push those components to their limits (not the
disks), to see the effect of the upgrade.
The following screenshots are the source of those results, and were taken
from Analytics on the 7410 - showing what the head node really did. These
tests were performed by Sun’s Open Storage Systems group (OSSG). I was able to log in
to Analytics on their systems and take screenshots from the tests they
performed after the fact (since Analytics archives this data) and check
that these results were consistent with my own - which they are.
Streaming cached read (original 7410):
Notice that we can now reach 2.15 Gbytes/sec for NFSv3 on the original 7410 (60 second average of network throughput, which includes protocol headers.)
When I first blogged about the 7410 after launch, I was reaching 1.90
Gbytes/sec; sometime later that became 2.06 Gbytes/sec. The difference is the
software updates - we are gradually improving our performance release after
release.
8k cached read ops (original 7410):
As a sanity check, we can multiply the observed NFS read ops/sec by their
known size - 8 Kbytes: 127,143 x 8 Kbytes = 0.97 Gbytes/sec. Our observed
network throughput was 1.01 Gbytes/sec, which is consistent with 127K x 8 Kbyte
read ops/sec (higher as it includes protocol headers.)
Streaming cached read (new 7410):
2.68 Gbytes/sec - awesome!
8k cached read ops (new 7410):
This is 75% faster than the original 7410 - this is no small hardware
upgrade! As a sanity test, this showed 223,969 x 8 Kbytes = 1.71 Gbytes/sec.
On the wire we observed 1.79 Gbytes/sec, which includes protocol headers. This is consistent with the expected throughput.
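The two sanity checks above are the same calculation: read ops/sec multiplied by the 8 Kbyte I/O size, converted to Gbytes/sec. Here is a small sketch of that arithmetic. The `ops_to_gbytes` name is hypothetical, and it assumes binary units (1 Gbyte = 1024 x 1024 Kbytes), which is what reproduces the figures quoted in this post.

```shell
# ops_to_gbytes OPS KBYTES: payload throughput in Gbytes/sec,
# assuming binary units (1 Gbyte = 1024 * 1024 Kbytes).
# (Hypothetical helper for illustration only.)
ops_to_gbytes() {
    awk -v ops="$1" -v kb="$2" \
        'BEGIN { printf "%.2f\n", ops * kb / (1024 * 1024) }'
}

ops_to_gbytes 127143 8    # original 7410
ops_to_gbytes 223969 8    # new 7410
```

This prints 0.97 and 1.71. The observed network throughput (1.01 and 1.79 Gbytes/sec) should sit a little above these payload figures, since it includes protocol headers.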
The systems tested above were the Barcelona-based and Istanbul-based 7410, both with max CPU and
DRAM, and both running the latest software (2009.Q3.) The same 41 clients were used to
test both 7410s.
The Sun Storage 7410 could support four ports of 10 GbE, with a theoretical
combined maximum throughput of 40 Gbit/sec, or 4.64 Gbytes/sec. However in practice it was
reaching about 2.06 Gbytes/sec when reading cached data over NFS. While
over 2 Gbytes/sec is fantastic (and very competitive), why not over 3 or 4
Gbytes/sec?
First of all, if you keep adding high speed I/O cards to a system, you may run out of system resources to drive them before you run out of slots to plug them into. Just because the system lets you plug them all in, doesn’t mean that the CPUs, busses and software can drive it at full speed. So, given that, what specifically stopped the 7410 from going faster?
It wasn’t CPU horsepower: we had four sockets of quad-core Opteron and
the very scalable Solaris kernel. The bottleneck was actually the
HyperTransport.
The HyperTransport is used as the CPU interconnect and the path to the
I/O controllers. Any data transferred with the I/O cards (10 GbE cards, SAS
HBAs, etc), will travel via the HTs. It’s also used by the CPUs so they can
access each other’s memory. In the diagram above, picture CPU0 accessing the memory which
is directly attached to CPU3 - which would require two hops over HT links.
A clue that the HyperTransport (and memory busses) could be the bottleneck
was found with the Cycles Per Instruction (CPI):
walu# ./amd64cpi-kernel 5
Cycles Instructions CPI %CPU
167456476045 14291543652 11.72 95.29
166957373549 14283854452 11.69 95.02
168408416935 14344355454 11.74 95.63
168040533879 14320743811 11.73 95.55
167681992738 14247371142 11.77 95.26
[...]
amd64cpi-kernel is a simple script I wrote (these scripts are not supported by Sun) to pull the CPI from the AMD CPU PICs (Performance Instrumentation Counters.) The higher the CPI, the more
cycles are waiting for memory loads/stores, which are stalling instructions. A CPI of
over 11 is the highest I’ve ever seen - a good indication that we are waiting
a significant time for memory I/O.
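The CPI column is simply the Cycles column divided by the Instructions column. As a quick sketch (the `cpi` helper is hypothetical, and assumes awk is available), the first row of the output above can be checked by hand:

```shell
# cpi CYCLES INSTRUCTIONS: cycles per instruction, to two decimal places.
# (Hypothetical helper for illustration only.)
cpi() {
    awk -v c="$1" -v i="$2" 'BEGIN {
        if (i == 0) { print "n/a"; exit }    # avoid dividing by zero
        printf "%.2f\n", c / i
    }'
}

cpi 167456476045 14291543652    # first row of the output above
```

This prints 11.72, matching the first row.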
Also note in the amd64cpi-kernel output that I included %CPU - CPU
utilization. With a CPU utilization of over 95%, how many of you would be
reaching for extra or faster CPU cores to improve the system? This is a problem
for all %CPU measurements - yes, the CPU was processing instructions, but it
wasn’t performing the ‘work’ that you would assume - instead those instructions are
stalled waiting for memory I/O. Add faster CPUs, and you stall faster (doesn’t
help.) Add more cores or sockets, and you could make the situation worse -
spreading the workload over more CPUs can decrease the L1/L2 CPU cache hit
rates, putting even more pressure on memory I/O.
To investigate the high CPI, I wrote more scripts to figure out what the
memory buses and HT buses were doing. My amd64htcpu script shows
the HyperTransport transmit Mbytes/sec, by both CPU and port (notice in the
diagram that each CPU has 3 HT ports):
walu# ./amd64htcpu 1
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3170.82 595.28 2504.15 0.00
1 2738.99 2051.82 562.56 0.00
2 2218.48 0.00 2588.43 0.00
3 2193.74 1852.61 0.00 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3165.69 607.65 2475.84 0.00
1 2753.18 2007.22 570.70 0.00
2 2216.62 0.00 2577.83 0.00
3 2208.27 1878.54 0.00 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 3175.89 572.34 2424.18 0.00
1 2756.40 1988.03 578.05 0.00
2 2191.69 0.00 2538.86 0.00
3 2186.87 1848.26 0.00 0.00
[...]
This shows traffic on socket 0, HT0 was over 3 Gbytes/sec. These
HyperTransport 1 links have a theoretical maximum of 4 Gbytes/sec, which we are
approaching. While they may not be at 100% utilization (for a 1 second
interval), we have multiple cores per socket trying to access a resource that
has reasonably high utilization - which will lead to stalling.
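To put the amd64htcpu numbers in terms of link utilization, each TX figure can be divided by the theoretical link maximum. The sketch below is not one of the original scripts: the `ht_util` name is made up, and the default of 4096 Mbytes/sec is an assumption that treats the 4 Gbytes/sec HT1 maximum as binary units.

```shell
# ht_util [MAX_MB]: read amd64htcpu-style rows on stdin (socket number
# followed by per-port TX MB/s) and print each port as a percentage of
# the link maximum. MAX_MB defaults to 4096, an assumed value for the
# 4 Gbytes/sec HT1 theoretical maximum. (Hypothetical helper.)
ht_util() {
    awk -v max_mb="${1:-4096}" '
        $1 ~ /^[0-9]+$/ {    # data rows start with a socket number
            for (p = 2; p <= NF; p++)
                printf "socket %d HT%d %.0f%%\n", $1, p - 2, $p / max_mb * 100
        }'
}

printf '%s\n' \
    'Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s' \
    '0 3170.82 595.28 2504.15 0.00' | ht_util
```

For the first sample row this reports socket 0 HT0 at 77% of the assumed maximum, averaged over the one second interval.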
After identifying memory I/O on HyperTransport 1 as a potential bottleneck, we were able
to improve the situation in a few ways:
With these changes, our performance improved and the CPI was down to about
10. To go further, we needed HyperTransport 3.
HT3 promised to triple the bandwidth; however, when I first got a prototype
HT3 system I was disappointed to discover that the max NFSv3 throughput was the
same. It turned out that I had been sent
upgraded CPUs, but on an HT1 system. If anything, this further confirmed what I
had suspected - faster CPUs didn’t help throughput, we needed to upgrade the
HT.
When I did get an HT3 system, the performance was considerably better -
by between 25% and 75%. HT links:
topknot# ./amd64htcpu 1
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5387.91 950.88 4593.76 0.00
1 6432.35 5705.65 1189.07 0.00
2 5612.83 3796.13 6312.00 0.00
3 4821.45 4703.95 3124.07 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5408.83 973.48 4611.83 0.00
1 6443.95 5670.06 1166.64 0.00
2 5625.43 3804.35 6312.05 0.00
3 4737.19 4602.82 3060.74 0.00
Socket HT0 TX MB/s HT1 TX MB/s HT2 TX MB/s HT3 TX MB/s
0 5318.46 971.83 4544.80 0.00
1 6301.69 5558.50 1097.00 0.00
2 5433.77 3697.98 6110.82 0.00
3 4581.89 4464.40 2977.55 0.00
[...]
HT3 is sending more than 6 Gbytes/sec over some links. The CPI was down to
6, from 10. The difference in the performance numbers was huge:
Along with the CPU upgrade (which helps IOPS), and DRAM upgrade (helps
caching working sets), the 7410 hardware update was looking to be an incredible upgrade
to what was already a powerful system.
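As a rough cross-check of the CPI figures quoted above (about 10 on HT1, about 6 on HT3): for the same number of CPU cycles, instruction throughput scales with the inverse of CPI. The sketch below shows that ratio. The `cpi_speedup` name is hypothetical, and the estimate is a simplification that ignores the extra cores per socket.

```shell
# cpi_speedup OLD_CPI NEW_CPI: instructions completed per cycle,
# relative to before, assuming the same number of cycles.
# (Hypothetical helper for illustration only.)
cpi_speedup() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", before / after }'
}

cpi_speedup 10 6
```

This prints 1.67, which is in the same range as the best-case improvement measured for the 8k read workload.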
If I’ve whetted your appetite for more CPU PIC analysis, on Solaris run “cpustat
-h” and fetch the document it refers to, which will contain a reference for the
CPU PICs for the platform you are on. The scripts I used above are really not
that complicated - they use shell and perl to wrap the output (as the man page
for cpustat even suggests!) Eg, the amd64cpi-kernel tool was:
#!/usr/bin/sh
#
# amd64cpi-kernel - measure kernel CPI and Utilization on AMD64 processors.
#
# USAGE: amd64cpi-kernel [interval]
#   eg,
#        amd64cpi-kernel 0.1    # for 0.1 second intervals
#
# CPI is cycles per instruction, a metric that increases due to activity
# such as main memory bus lookups.

interval=${1:-1}                                # default interval, 1 second
set -- `kstat -p unix:0:system_misc:ncpus`      # assuming no psets,
cpus=$2                                         # number of CPUs
pics='BU_cpu_clk_unhalted,sys0'                 # cycles
pics=$pics,'FR_retired_x86_instr_w_excp_intr,sys1'      # instructions

# cpustat -t adds a cycle count (tsc) column, so the fields are expected to
# be: time cpu event tsc pic0 pic1. The shell value of $cpus is spliced into
# the perl text by closing and reopening the single quotes.
/usr/sbin/cpustat -tc $pics $interval | perl -e '
    printf "%16s %16s %8s %8s\n", "Cycles", "Instructions", "CPI", "%CPU";
    while (<>) {
        next if ++$lines == 1;          # skip the cpustat header line
        @f = split;                     # explicit array; implicit split to @_ is deprecated
        $total += $f[3];                # tsc: all cycles in the interval
        $cycles += $f[4];               # pic0: unhalted cycles
        $instructions += $f[5];         # pic1: retired instructions
        if ((($lines - 1) % '$cpus') == 0) {    # one line read per CPU: print a summary row
            printf "%16u %16u %8.2f %8.2f\n", $cycles, $instructions,
                $instructions ? $cycles / $instructions : 0,
                $total ? 100 * $cycles / $total : 0;
            $total = 0;
            $cycles = 0;
            $instructions = 0;
        }
    }
'
A gotcha for this one is the “sys” modifier on the pics definitions: it makes these PICs record activity during both user code and kernel code, not just user code.
I’ve previously posted many numbers covering 7410 performance, although I had yet to collect the full set. I
was missing iSCSI, FTP, HTTP and many others. This hardware upgrade changes
everything - all my previous numbers are now out of date. The numbers
for the new 7410 are so far between 25% and 75% better than what I had posted
previously!
Performance testing is like painting the Golden
Gate Bridge: once you reach the end you must immediately begin at the start
again. In our case, there are so many software and hardware upgrades that
once you approach completing perf testing, the earlier numbers are out of date.
The OSSG (who gathered the numbers at the start of this post)
are starting to help out so that we can test and share numbers more
quickly.
I’ve created a new column of numbers on my summary post, and I’ll fill out the new numbers as I get them.
For this 7410 upgrade, the extra CPU cores help - but it’s more about the
upgrade to the HyperTransport. HT3 provides 3x the CPU interconnect
bandwidth, and dramatically improves the delivered performance of the 7410:
by between 25% and 75%. The 7410 was already a powerful server; it has now raised the
bar even higher.
Source: http://blogs.sun.com/brendan/entry/7410_hardware_update_and_analyzing