
Solaris to Linux Porting Guide

Part number: AA-RX02B-TE Published September 2009, Edition 3

© 2005, 2006, 2007, 2009 Hewlett-Packard Development Company, L.P.

Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

AMD and AMD64 are trademarks of Advanced Micro Devices, Inc. BEA and JRockit are registered trademarks of BEA Systems, Inc. GNU is a trademark of the Free Software Foundation. Intel, Itanium, and Itanium-based are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the U.S. and other countries. Java is a registered trademark of Sun Microsystems, Inc. Linux is a U.S. registered trademark of Linus Torvalds. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Oracle is a registered U.S. trademark of Oracle Corporation. POSIX and IEEE are registered trademarks of the Institute of Electrical and Electronic Engineers, Inc. Sun, Solaris, and Trusted Solaris are U.S. trademarks of Sun Microsystems, Inc. SPARC is a U.S. registered trademark of SPARC International, Inc. UNIX and X/Open are registered trademarks of The Open Group. Veritas is a trademark of Symantec Corporation.

Contents

About This Guide
  Audience
  Organization
  Related Information
  Publication History
  Conventions
  HP Encourages Your Comments
1 Value of Standards and Open Source Systems
  1.1 Introduction
  1.2 Linux and HP
  1.3 Why Use Linux?
  1.4 Why Use the Linux Standard Base?
2 Porting Process Summary
  2.1 Porting Process
  2.2 Porting Issues and Recommendations
  2.3 Other Considerations
  2.4 Operating System Version Identification
  2.5 Porting and Coding Practices
  2.6 Porting Services
3 Development Environment Overview
  3.1 Compiler Overview
    3.1.1 C Compilers
    3.1.2 C++ Compilers
    3.1.3 Java
    3.1.4 Fortran
  3.2 Shell and Scripting Languages Overview
    3.2.1 Shells
    3.2.2 Awk
    3.2.3 Perl
    3.2.4 Ruby
    3.2.5 Python
    3.2.6 Tcl/Tk
  3.3 Linker and Loader Overview
  3.4 System Library and API Overview
  3.5 Setting Up a Linux Environment
  3.6 Third-Party Dependencies
  3.7 Migrating Your Application
4 Compilers
  4.1 C Compilers
    4.1.1 Language Mode Options
    4.1.2 Preprocessor Macros
    4.1.3 Optimization
    4.1.4 Pragmas
  4.2 C++ Compilers
    4.2.1 Template Implementations
    4.2.2 Template Instantiation
5 Linkers and Loaders
  5.1 Link Editor
    5.1.1 Command-Line Options
    5.1.2 Environment Variables
    5.1.3 Library Search Order
    5.1.4 Linker-Defined Symbols
  5.2 Run-Time Linking
    5.2.1 Library Resolution
    5.2.2 Symbol Binding
    5.2.3 Environment Variables
  5.3 Linker-Related Tools
    5.3.1 ldd
    5.3.2 elfdump
    5.3.3 ar
    5.3.4 lorder
    5.3.5 nm
6 Other Development Tools
  6.1 Source Code and Version Control
  6.3 Integrated Development Environments (IDEs)
  6.4 Debugging Tools
  6.5 Defect Tracking Tools
  6.6 HP Solaris-to-Linux Application Transition Tools
  6.7 Shells
7 System Libraries, APIs, and Run-Time Environment
  7.1 System Libraries and APIs
  7.2 Solaris Libraries that Are Not Available on Linux
  7.3 Solaris Header Files that Are Not Available on Linux
  7.4 Files and File Handles
  7.5 Networking
  7.6 Math Library
  7.7 Developing Libraries
  7.8 Security APIs
8 Threads
  8.1 Solaris Thread Models
  8.2 Linux POSIX Threads Models
  8.3 Mapping Solaris Threads to the Linux NPTL APIs
  8.4 Additional Information on LinuxThread Implementations
9 Endian Considerations
  9.1 Overview
  9.2 Persistent Data
  9.3 Byte Swapping
  9.4 Floating-Point Data
  9.5 Unused Bytes
  9.6 Unions
  9.7 Initializing Multiword Entities in 32-Bit Chunks
  9.8 Hexadecimal Constants Used as Byte Arrays
  9.9 Other Considerations
10 Porting from 32 Bits to 64 Bits
  10.1 Why Port to 64 Bits?
  10.2 Linux 64-Bit Environment
  10.3 Porting Issues
  10.4 Porting Tools
11 File System and Cluster Considerations
  11.1 File System Support
  11.2 Logical Volume Managers (LVM)
  11.3 Sun Clusters: Background Summary
  11.4 HP Serviceguard Clusters: A Comparison
  11.5 VERITAS Clusters: A Comparison
  11.6 Additional Information
12 Linux with Security Enhanced Linux (SELinux)
  12.1 Background
    12.1.3 Linux Distributions Containing SELinux
  12.2 Type Enforcement (TE)
  12.3 SELinux Policy
  12.4 The Targeted Policy
  12.5 Fixing Application Problems
  12.6 Diagnosing Access Failures
  12.7 SELinux Policy Booleans
  12.8 Adding a Local Policy Module
  12.9 Tools and Utilities
  12.10 SELinux Policy Editors
  12.11 SELinux API
  12.12 New Features in SELinux
  12.13 Additional Information
13 Porting Trusted Solaris Applications to Security Enhanced Linux
  13.1 Background
  13.2 Terminology
  13.3 Mandatory Access Policy
  13.4 Sensitivity Labels
  13.5 Process Model
  13.6 Root Account Usage
  13.7 Role-Based Access Control (RBAC)
  13.8 Auditing
  13.9 Polyinstantiation
  13.10 Trusted Networking
  13.11 Trusted Windows
  13.12 Porting Trusted Solaris Sensitivity Label Code to SELinux
  13.13 Porting Trusted Solaris Privileges to SELinux
  13.14 Roles and Authorizations
  13.15 Putting It All Together
14 Virtualization
  14.1 HP ICE-Linux
    14.1.1 Why Use HP Insight Control Environment for Linux?
    14.1.2 Features and Benefits
  14.2 Virtualization on Red Hat Enterprise Linux 5
  14.3 New Virtualization Features in Red Hat Enterprise Linux 5.1
15 Additional Resources
A C Compiler Option Comparisons
B C++ Compiler Option Comparisons
C Linker Option Comparisons
D Make Suffix Rules
E Porting Checklist


About This Guide

This guide provides information and tips on porting from Sun Solaris™ to Linux®. The primary focus is the migration of 32-bit and 64-bit applications from Solaris 8, Solaris 9, and Trusted Solaris 8 to Linux distributions with Linux Standard Base (LSB) 3.0 or higher certification on industry-standard1 HP ProLiant and HP Integrity servers. Throughout this guide, whenever Linux is referred to without a qualifying release identification, it means any LSB-certified Linux distribution. Applications that must preserve the Multilevel Security (MLS) features found in Trusted Solaris 8 need to be ported to Linux distributions with the Security Enhanced Linux (SELinux) extensions.

HP has developed this guide to help you take advantage of open systems and multiplatform environments. Technical updates and errata will be available from HP on the DSPP Linux Developer web site at

www.hp.com/go/linuxdev

Audience

This guide is intended as a planning aid and reference for application developers, engineering managers, and product managers interested in porting their Solaris and Trusted Solaris applications to Linux. Chapters 2 and 3 provide an overview of the porting process and the Linux application development environment; these chapters are intended for all developers and key decision makers in your organization. The remaining chapters provide in-depth technical information on specific areas of porting. These chapters assume that you have a basic understanding of application development and the technical aspects of your applications on Solaris.

Other documents intended for a higher-level comparison and discussion of the issues and pitfalls of migrating applications from Solaris to Linux are also available. These papers and other transition tools are available from HP on the DSPP Linux Developer web site at www.hp.com/go/linuxdev. For information on Linux solutions from HP, refer to www.hp.com/go/linux and www.hp.com/go/linuxsecurity.

1 HP produces industry-standard 32-bit and 64-bit servers on both Intel® and AMD™ processors.


Organization

This guide is organized as follows:

Chapter 1: Discusses the value of standards and open source systems.
Chapter 2: Summarizes the porting process.
Chapter 3: Describes the Linux application development environment.
Chapter 4: Compares the Sun and GNU compilers.
Chapter 5: Compares the linker and loader on the Solaris and Linux systems.
Chapter 6: Discusses other development tools useful in the porting effort.
Chapter 7: Discusses differences in the system libraries, APIs, and run-time environment.
Chapter 8: Presents information on porting multithreaded applications.
Chapter 9: Discusses data storage differences between big-endian and little-endian systems.
Chapter 10: Compares 32-bit and 64-bit application development.
Chapter 11: Provides a feature comparison of several cluster environments.
Chapter 12: Presents information on SELinux.
Chapter 13: Provides information on porting Trusted Solaris applications to Linux with the SELinux extensions.
Chapter 14: Discusses virtualization.
Chapter 15: Lists additional porting resources.
Appendix A: Provides a mapping of Sun ONE Studio C compiler options to GNU C compiler options.
Appendix B: Provides a mapping of Sun ONE Studio C++ compiler options to GNU C++ compiler options.
Appendix C: Provides a mapping of Solaris linker options to GNU linker options.
Appendix D: Provides a mapping of Solaris make suffix rules to GNU make suffix rules.

Related Information

The HP Linux Developer web site (www.hp.com/go/linuxdev) provides papers and other migration tools, including the Migrating from Solaris to Linux paper. This paper provides a high-level comparison of the two development environments and potential migration pitfalls. For readers interested in porting applications from Trusted Solaris to Linux with the SELinux extensions, additional information is available, including a paper called Legacy MLS/Trusted Systems and SELinux. This paper presents the concepts and comparisons needed to simplify migration of MLS applications.

HP has also developed the Solaris-to-Linux application transition tools suite. This suite includes tools for planning, porting, and deployment, and it has been extended to support Trusted Solaris 8 and Linux systems with SELinux extensions. This tool suite is a must for anyone porting Solaris applications to Linux. More information on the HP Solaris-to-Linux application transition tools is available in Chapter 6. See Chapter 15 for a list of additional resources to help with porting applications from Solaris to Linux.

Publication History

The guide's publication date indicates its document edition. The publication date changes when a new edition is released. The document is available at

www.hp.com/go/linuxdev.

Edition    Publication Date   Comments
---------  -----------------  ------------------------------------------------
Edition 1  July 2005          Initial release
Edition 2  January 2007       Includes updates; extended to cover porting
                              applications from Trusted Solaris to Linux with
                              the SELinux extensions
Edition 3  September 2009     Adds Solaris 9 and 10 impacts with respect to
                              Linux 2.6.18; includes updates on the latest
                              security features and virtualization on RHEL 5

Conventions

This guide uses the following typographical conventions:

%           A percent sign represents the C shell system prompt.

$           A dollar sign represents the system prompt for the Bourne, Korn, and POSIX shells.

% cat file  Boldface type in interactive examples indicates typed user input.

italic      Italic (slanted) type indicates variable values, placeholders, and function argument names.


(vertical ellipsis)   Vertical ellipsis points indicate that a portion of an example that would normally be present is not shown.

...                   In syntax definitions, horizontal ellipsis points indicate that the preceding item can be repeated one or more times.

cat(1)      A cross-reference to a manpage includes the appropriate section number in parentheses. For example, cat(1) indicates that you can find information on the cat command in Section 1 of the manpages.

[ | ]       In syntax definitions, brackets indicate items that are optional.

{ | }       In syntax definitions, braces indicate items that are required. Vertical bars separating items inside brackets or braces indicate that you choose one item from among those listed.

In addition, the following conventions and terms are used:

· Decimal constants consist of a sequence of decimal digits (0-9) without a leading zero.
· Hexadecimal constants consist of the characters 0x (or 0X) followed by a sequence of hexadecimal digits (0-9, a-f, A-F). Unless otherwise stated, these constants are written with the first lexical digit as the most significant digit.
· Octal constants consist of a leading zero followed by a sequence of octal digits (0-7). Unless otherwise stated, these constants are written with the most significant digit first.
· Binary constants consist of a sequence of binary digits (0-1) with a trailing b character. Unless otherwise stated, these constants are written with the most significant bit first.
· A byte is defined to be 8 bits.

HP Encourages Your Comments

HP welcomes any comments and suggestions you have on this guide. You can send comments to stol pg [email protected]

Please include the following information along with your comments:

· The full title and edition of the guide.
· The section numbers and page numbers of the information on which you are commenting.
· The versions of Solaris and Linux distributions that you are using.


The email address should not be used to report system problems or to place technical support inquiries. Please address support questions to your local system vendor or to the appropriate HP technical support office.


1 Value of Standards and Open Source Systems

Linux has become an enterprise operating system. Its low total cost of ownership (TCO) and similarities to existing operating systems make it an ideal option for reducing costs while integrating well with your existing environment. Linux provides both open source and commercial solutions for many of today's market segments. Migrating from Solaris to Linux reduces your TCO while increasing your deployment options. The availability of Linux on a variety of industry-standard platforms enables you to determine how to deploy your solution, rather than relying on where the operating system is supported.

An additional advantage is that you can leverage HP's experience with open source systems by taking advantage of programs such as the HP Open Source Middleware Stacks (OSMS)2 program. This program offers many solutions, including the following:

· Select from supported "Building Blocks", which include best-of-breed software components.
· Develop your own solution with functionally-specific do-it-yourself "Blueprints" of integrated and supported middleware stacks.
· Work with HP Consulting Services to solve the hardest problems.

These services are available to help you generate fully customizable middleware stacks from open source as well as commercial software. With the numerous options available to you, it may seem like an overwhelming task to get started with Linux. The HP experience can help ease any transition issues you may have.

2 www.hp.com/go/osms


1.1 Introduction

This guide provides an overview of the process for porting applications from Solaris to Linux and from Trusted Solaris 8 to Linux with Security Enhanced Linux (SELinux) enabled. HP provides numerous resources to help you port your applications; this guide highlights several of them. Rather than using a vendor-specific implementation of Linux as your target platform, use an implementation that is certified as being compliant with the Linux Standard Base (LSB). This will enable maximum portability of your applications between different Linux vendors and versions.
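To see which LSB release a candidate target distribution reports, most LSB-compliant distributions ship an lsb_release utility (typically packaged as lsb-release or redhat-lsb). The snippet below is an illustrative sketch, assuming only that utility and the conventional /etc/*-release files; the exact packaging varies by distribution:

```shell
# Report the LSB version and modules supported by this distribution.
if command -v lsb_release >/dev/null 2>&1; then
    lsb_release -a            # Distributor ID, Description, Release, LSB modules
else
    # Fall back to the distribution's release files if the tool is absent.
    cat /etc/*-release 2>/dev/null || echo "no LSB release information found"
fi
```

The LSB version string reported here (for example, a 3.0 or higher module list) is what certification against LSB 3.0 refers to.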

1.2 Linux and HP

HP has been involved with Linux systems from the beginning, and remains involved today. Many of the developers of the first 64-bit Linux kernel implementation were HP employees, and HP continues to hold leadership roles in the Linux kernel and other core component development communities. Dedicated Linux R&D labs, with more than 20 years of experience developing UNIX® libraries and device drivers, are responsible for enabling, testing, and supporting Linux on HP workstations and servers. HP offers tested and proven solutions that work right the first time.

HP delivers industry-leading hardware and software Linux solutions and maintains close partnerships with major application software suppliers. As a Linux partner, HP provides state-of-the-art management and clustering technologies for Linux, as well as broad service and support expertise for customers and ISVs. Working with HP enables you to get both your hardware and software support from a single organization. For more information on HP and Linux, visit www.hp.com/go/linux.

In addition to its Linux involvement, HP is also highly active in the open source community. HP has a history of pioneering open computing and has been involved in the open source community longer than any other major hardware vendor. HP labs contribute to numerous high-profile open source applications, some of which are available from opensource.hp.com.

HP is committed to the success of its enterprise customers and to driving Linux into the enterprise. Since 2003, qualified customers have obtained indemnification from HP against legal action from SCO. HP continues to extend this offer to protect and empower you to deploy Linux and open source technology in production with less risk and distraction. For more information, refer to the Linux indemnification program at www.hp.com/go/linuxprotection.

1.3 Why Use Linux?

Linux has become a popular operating system, appearing everywhere from PDAs (refer to www.handhelds.org) to enterprise-level servers (refer to www.hp.com/go/proliantlinux and www.hp.com/go/integritylinux). Its open nature allows you to add or remove components easily to achieve your goals. Much of its popularity has been attributed to the familiarity of its interfaces and features: the command line offers a familiar interface to UNIX users, while the graphical interfaces are familiar to Microsoft® Windows® users. By offering multiple interfaces and system services, such as file sharing, Linux provides a low-cost, industry-standard alternative to both Windows and UNIX environments.

While Solaris runs best on relatively expensive proprietary hardware, Linux is optimized to run on industry-standard hardware such as HP ProLiant and Integrity servers. The ability of Linux to run on industry-standard servers means a low cost of entry, and its low total cost of ownership (TCO) makes it attractive for both edge and enterprise servers. For example, Linux on HP BladeSystems has been shown to offer significant savings over Sun Solaris running on x86 rack-mounted servers, including savings in hardware, software, labor, and overhead. For more information, refer to h71028.www7.hp.com/ERC/downloads/TCO_Blades_7-28.pdf, or see the savings for yourself by following the Linux TCO calculator link at www.hp.com/wwsolutions/linux/resourcecentre.html.

Most important, the availability of Linux on numerous hardware platforms provides the flexibility to choose your hardware vendor without worrying about the availability of the operating system. It also enables you to choose based on performance and application availability. In a recent benchmark, HP Integrity rx6600 servers running Oracle® set new TPC-C records for performance and price/performance in the 4 CPU/8 core class of systems. For more information on these results, refer to www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=106102601.

1.4 Why Use the Linux Standard Base?

The increase in the popularity of Linux has led to many independent distributions based on the core Linux kernel. While each of these distributions is Linux, each has subtle differences. In practice, these differences have meant that supporting Linux means supporting a limited set of distributions, each of which must be certified independently.

HP is a founding sponsor of the Free Standards Group (now the Linux Foundation, www.linux-foundation.org) and a leader in driving the Linux Standard Base (LSB), an effort to set standards for Linux so that common software applications can run on any LSB-compliant distribution. With the continuing refinement of the LSB and of LSB-compliant distributions, deploying your application on Linux means reduced risk, lower porting costs, and lower support costs, because expensive per-distribution certifications are no longer needed. By certifying your application in an LSB environment, you can ensure that covered components will work correctly on any LSB-certified distribution.

For more information on the Linux Standard Base, refer to www.linuxfoundation.org/en/LSB. The Free Standards Group published a book, Building Applications with the Linux Standard Base, which discusses developing applications using the LSB. Refer to lsbbook.gforge.freestandards.org/lsbbook.html to read it online.


2 Porting Process Summary

This chapter provides information on porting from Solaris and Trusted Solaris to Linux. It is intended for developers, engineering managers, and product managers. It assumes that you have some technical knowledge of your Solaris application and the software development process.

2.1 Porting Process

Porting is the process of taking software that runs on one operating system and computer architecture and making it run on another operating system or architecture. In some cases, where the operating systems or architectures are closely related, or when the source code has been designed to be portable, the port is straightforward: recompile the sources and run the validation tests to verify that the new program operates just like the original. However, not all application ports follow this straightforward model. The porting process then involves a number of steps, which are discussed in this chapter.

The emergence and adoption of standards such as POSIX.1, UNIX 95, and UNIX 98 has led to a convergence in application programming interfaces (APIs). This has greatly reduced the difficulty of porting but has not eliminated it. In practice, an application port commonly runs into two types of issues. First, bringing existing code into conformance with a standard is an ongoing process of ever-closer approximation. Once a high degree of conformance is achieved, the remaining rare discrepancies are often found only when someone doing a port stumbles on an area that was ambiguous in the standard, or was not exactly covered by a conformance suite, and was implemented in different ways on the two platforms. Second, some behaviors are intentionally left undefined by the standards. Applications that rely on a particular vendor's implementation of undefined behavior, or on a particular vendor's extensions to a standard, will require additional effort when porting.


A typical sequence of tasks required to port an application to a new platform is as follows:

1. Identify your product's development environment dependencies, including consideration of the following:

   Tool Dependencies           Examples
   Source code management      SCCS, RCS, CVS, and Subversion
   Build environment           make, xmkmf, Ant, and dependency tools
   Installation                tar, scripting languages, and package or RPM
   Application management      SNMP facilities, PAM facilities, and kernel tuning
   Third-party products        Libraries, application servers, and application completers
   Development                 Compilers, debuggers, editors, and IDEs
   Profiling and performance   prof, gprof, and HP Caliper
   Test                        Scripting languages, diff, expect, and test frameworks
   Documentation               TeX, javadoc, DocBook, OpenOffice.org, and other applications

2. Identify and assess the likely impact of differences in the operating system features, cluster features, run-time libraries, and third-party product versions that might affect your application porting project.
3. Identify and plan for any end-user impact that might result from delivering your product on the new target system. This can include the need for data conversion, uptime support during the transition period, product installation changes, product administration changes, and training.
4. Identify and address the application porting team's need for documentation and training.
5. Port your product's development environment and adapt your development procedures as needed.
6. Clean up the application source code, identifying and removing architectural dependencies and nonstandard practices where possible.
7. Extend the application source code to support the new target platform. This can include the addition of conditional modules and conditional code (both high-level and assembler code). Generalize implementations to avoid architectural dependencies if possible.
8. Compile the source code, preferably using ANSI or more strict error-checking switches to flag potential issues. Fix any problems found at compile time.
9. Run the application with a broad set of test cases, debug any problems, and correct any run-time problems that are detected.
10. Recompile the code and repeat the process as necessary.

2.2 Porting Issues and Recommendations

In general, the more compliant the application source code is with language, coding, and run-time standards, the fewer the changes that will be required. The following are some characteristics of a "well-behaved" Solaris application that might be easy to port:

- Source code uses only interfaces documented as conforming to industry standards, such as IEEE Standard 1003.1:1996 (POSIX.1) and the Single UNIX Specification.
- Source code conforms to the relevant ANSI/ISO language standard.
- Source code is already portable to multiple platforms.

Source code changes will be necessary if the application has any of the following characteristics:

- The source code includes any assembly source modules or inline assembler statements.
- The source code exploits specific knowledge of the Sun hardware (for example, hard-coded assumptions about page size) or of the Solaris operating system (for example, file system layout).
- The source code depends on endian-specific data.
- The source code calls any of the Trusted Solaris APIs (they are all proprietary).
- The source code exploits undocumented features of Solaris.
- The source code is linked into and runs in Solaris kernel mode. This includes kernel modules and device drivers.

Source code changes may also be required if the application depends on Solaris or Trusted Solaris proprietary operating system features. For example, if your application uses the exec(2) system call to run a Solaris-specific command, either a suitable command will need to be provided or the application code will need to be changed.

Refer to the following Web site for more information, tools, and techniques: www.hp.com/go/linuxdev

2.3 Other Considerations

During the planning phase of an application port, you need to assess the effort to maintain and support your application's customers. In many cases, your customer support requirements will increase during the transition period. New or modified customer-initiated procedures may create special needs for your support staff. Existing remote diagnostics techniques and tools may require changes to support the new revision of your application and necessitate training for both your customers and your support staff.

During the deployment phase of a newly ported application, consider the following:

- Acceptance testing and application availability/uptime might need to be planned and managed.
- Administrative and operator documentation, end-user documentation, and training might need to be developed.
- Additional system backups may be needed to ensure that customers can return to a known reference point if they encounter any problems with the port.
- Parallel operation using both versions of the application may be warranted, based on the degree of change in the application environment (hardware, software, or both), the critical importance of the application to the business, the number of people affected, and the volume of information that the application touches.
- Extra procedures may be needed when the change invalidates any customer-retained archives that are required for business or legal reasons, such as accessibility to backup media.

2.4 Operating System Version Identification

The information in this guide assumes that the Solaris application was developed for Solaris 8, Solaris 9, or Trusted Solaris 8 running on SPARC hardware. Even if the application you are porting is based on a different version of Solaris or Sun hardware, most of the content of this guide still applies.

This guide assumes that the target Linux platform is certified as compliant with the Linux Standard Base (LSB) version 3.0 or higher. This allows for maximum portability of your applications between different Linux vendors and versions. Information about LSB-certified Linux distributions is available at www.linux-foundation.org/en/Products and includes the following products:

- SUSE Linux Enterprise Server 9 (SLES 9) Service Pack 3
- SUSE Linux Enterprise Server 10 (SLES 10)
- Red Hat Enterprise Linux 4 (RHEL 4) Update 2 and above

Linux users and applications can use the lsb_release(1) command to determine the LSB version information for the Linux distribution. Note that uname(1) only returns the kernel version information. Refer to the lsb_release(1) manpage for more information. For more information on the LSB, refer to www.linuxfoundation.org/en/LSB. The Free Standards Group has published a book, Building Applications with the Linux Standard Base, which discusses developing applications using the LSB. Refer to lsbbook.gforge.freestandards.org/lsbbook.html for more information.

2.5 Porting and Coding Practices

Many of the coding practices in this guide are not specific to any one vendor, nor are they specific to the Solaris or Linux operating systems. They are simply good, standard coding practices. A number of books are available that cover similar ground. Some of the better ones include:

- The C Programming Language, Second Edition, by Brian W. Kernighan and Dennis M. Ritchie; Prentice-Hall Software Series, Prentice-Hall, Inc.
- C Traps and Pitfalls, by Andrew Koenig; Addison-Wesley Publishing Company
- The C++ Programming Language, by Bjarne Stroustrup; Addison-Wesley Publishing Company
- C++ Primer, by Stanley B. Lippman, Josée Lajoie, and Barbara E. Moo; Addison-Wesley Professional

2.6 Porting Services

HP provides a full set of services for both Linux system customers (www.hp.com/hps/linux) and Linux application ISVs (www.hp.com/go/dspp and www.hp.com/go/linuxdev). Some of these services are highlighted in the following sections.

2.6.1 HP Developer & Solutions Partner Program

The HP Developer & Solution Partner Program (DSPP) is designed to help ISVs, developers, and system integrators create unique solutions across the broad spectrum of HP platforms and operating systems. The goal of the program is to match high-quality services with your development cycle, marketing, and sales needs. By registering on the DSPP portal at www.hp.com/go/dspp, your free membership lets you:

- Call program centers to talk with DSPP representatives.
- Purchase HP equipment at discounted prices to support your development, testing, and application demonstration work.
- Obtain technical assistance for development issues, such as questions about compilers.
- Obtain sales and marketing support. For example, list your software product in the HP Linux Solutions Catalog (follow the "Partner resources" link at www.hp.com/go/dspp).

Visit www.hp.com/go/linuxdev to learn about the following DSPP resources for Linux developers:

- Request migration and porting assistance.
- Reserve physical or virtual system access at an Application Migration & Testing Center.
- Download software tools, both from HP and third parties.
- Attend webcast training and seminars.


3 Development Environment Overview

Linux provides a rich environment of application development tools, including both open source and commercial offerings. This chapter provides an overview of the Linux development environment. Additional details on compilers, linkers, and other development tools are discussed in the following chapters.

3.1 Compiler Overview

Linux developers primarily use the GNU Compiler Collection (GCC), which is included with most Linux distributions. Other free and commercial Linux compilers are also available. For example, Intel® provides C, C++, and Fortran compilers for Linux; these compilers are free for noncommercial use, require a license for commercial use, and produce highly optimized output for IA-32 and Intel® Itanium®-based systems.

The Trusted Solaris development environment uses the same development tools and utilities as Solaris. The main difference is that Trusted Solaris has additional libraries and header files, as well as behavioral differences, that are not present on Solaris. See Chapter 13 for more information.

3.1.1 C Compilers

The Sun Forte, GNU, and Intel® C compilers all support the C89 language features, as well as some C99 features. For details on the C99 features supported by the GNU C compiler, refer to the following sites for versions 3.4 and 4.1, respectively:

- gcc.gnu.org/gcc-3.4/c99status.html
- gcc.gnu.org/gcc-4.1/c99status.html

Code written in strict compliance with the ANSI C standard will have the greatest level of portability and the most consistent behavior across platforms. One of the challenges in porting C source code from Solaris to Linux is updating the compiler options used in makefiles. More details on the differences between the Sun and Linux compilers can be found in Chapter 4 of this guide. For a mapping of C compiler options from Sun C to gcc and Intel C, see Appendix A. You can also consider using the Solaris-to-Linux Porting Kit (SLPK); see Section 3.5.3 for more information.

3.1.2 C++ Compilers

The Sun, GNU G++, and Intel® C++ compilers all conform to the ANSI/ISO C++ International Standard ISO/IEC 14882:1998. The default implementations of iostream and the Standard Template Library (STL) used in the current versions of the Sun, G++, and Intel® compilers are compliant with the 1998 C++ standard. For more information on the G++ run-time libraries, including extensions and backward compatibility, refer to gcc.gnu.org/onlinedocs/libstdc++/faq/.

Detailed information on C++ compilers is available in Chapter 4 of this guide. For a mapping of Sun to G++ and Intel C++ compiler options, see Appendix B in this guide. For information on compatibility between the Intel® and GCC compilers, refer to the Intel® Compilers for Linux: Compatibility with GNU Compilers paper at www.intel.com/software/products/compilers/techtopics/LinuxCompilersCompatibility.htm.

3.1.3 Java

Java™ on Linux provides a stable and robust platform for developing and deploying applications, and is included with most Linux distributions. A number of different vendors offer Java development kits for Linux on 32-bit and 64-bit systems. Several implementations are available, depending on your requirements:

- Sun offers a development kit and Java Virtual Machine (JVM) for Linux. The current version of the Sun offering is 6.0, and you can download it for 32-bit and 64-bit Linux on some Linux platforms, including IA-32.
- BEA® offers a highly optimized JVM for the Linux platform. The BEA JRockit® JVM offers just-in-time (JIT) compilation, unique optimizations, thread management, and adaptive memory management. JRockit is available for IA-32, Intel® EM64T, AMD64™, and Intel® Itanium®-based systems. Refer to www.bea.com/jrockit for more information.
- IBM offers the IBM Developer Kit for Linux, which includes a JIT compiler and a mixed-mode interpreter that uses the JIT compiler only for frequently called methods. Refer to ibm.com/java/jdk/ for more information.

Note: HP is only providing a list of options. HP does not endorse a specific vendor.

Another Java development tool for the Linux platform is the GNU Compiler for Java (GCJ). The GCJ, included with GCC, is a portable, optimizing, ahead-of-time compiler for the Java Programming Language that can compile Java source code to Java bytecode or directly into native machine code. The libgcj library, which is the GCJ run-time library, provides the core class libraries, a garbage collector, and a bytecode interpreter. It can dynamically load and interpret class files, resulting in mixed compiled and interpreted applications. Refer to gcc.gnu.org/java for more information.

3.1.4 Fortran

GNU Fortran support for the Fortran 95 standard is under development, and the current implementation is available in GCC 4.1.1. Refer to gcc.gnu.org/fortran and www.g95.org for more information. GNU Fortran is included with most Linux distributions. If your application requires complete Fortran 95 functionality, you need to purchase a commercial implementation of the compiler. Several implementations are available, depending on your requirements:

- Intel® Fortran Compiler (Fortran 95). For more information, refer to www.intel.com/software/products/compilers/flin/.
- Pro Fortran (Fortran 95 with F77, F90, and F95 modes) from Absoft Development Tools. For more information, refer to www.absoft.com/Products/Compilers/Fortran/Linux/fortran95/.
- F77, F90/F95, and HPF from The Portland Group. For more information, refer to www.pgroup.com/products/serverindex.htm.
- F77, F90, and F95 from PathScale. For more information, refer to www.pathscale.com/ekopath.html.
- VAST/f90 from Crescent Bay Software. For more information, refer to www.crescentbaysoftware.com.

Note: HP is only providing a list of options. HP does not endorse a specific vendor.

3.2 Shell and Scripting Languages Overview

Linux provides a large number of shell and scripting languages to meet the needs of Solaris applications being ported to Linux. Because Solaris versions for many of these Linux shells and scripting languages are available, your porting team can test these tools on Solaris before porting to Linux. This section provides a brief overview of some of the more common Linux shell and scripting languages and provides pointers to more information. These shells and scripting languages are included with most Linux distributions.

3.2.1 Shells

Many Solaris applications include either Bourne (sh), Korn (ksh), or C Shell (csh) scripts. These applications should be able to find a compatible Linux shell to run these scripts without changing shell dialects. For example, if shell scripts and users are using the csh on Solaris, they can use the Turbo C Shell (tcsh) on Linux. When creating a new user account on Linux, most distributions configure the Bourne Again SHell (bash) as the default shell. Additional information on Linux shells is available in Chapter 6.

3.2.2 Awk

The awk language was named in the mid-1970s after its developers, Alfred V. Aho, Peter J. Weinberger, and Brian Kernighan. Essentially, awk is a report-generating language that uses extensive pattern matching. Several versions of awk are available for Linux, but the most popular today is GNU awk (gawk). Most Solaris awk programs should port easily to gawk. Because awk is required by POSIX, gawk is installed by default on Linux systems. For documentation, downloads, and other information on gawk(1), refer to www.gnu.org/software/gawk/gawk.html.


3.2.3 Perl

Perl was originally designed as a glue language for UNIX and as a replacement for the sed(1) and awk(1) commands. It is highly portable and extensible. The most common version of Perl currently available is Perl 5. Perl 5 is used on both Solaris and Linux, although Linux tends to carry more recent versions, and most Perl 5 programs written on Solaris should port to Linux with a minimum of trouble.

The one difficulty found in many Perl scripts is the first line of the script, which contains the location of the Perl interpreter. Most systems today have Perl installed as an integral part of the system in /usr/bin. A common trick is to use the env(1) command to search the current value of PATH for the Perl executable. This can be done by changing the line in your script that identifies it as Perl code from #!/usr/bin/perl to #!/usr/bin/env perl.

Most Linux distributions include some of the more popular Perl modules. To check on the availability of other Perl modules, refer to the Comprehensive Perl Archive Network (CPAN) at www.cpan.org. For more documentation, downloads, and other information, refer to www.perl.org.

3.2.4 Ruby

Ruby is a reflective, object-oriented programming language. It combines syntax inspired by Perl with Smalltalk-like object-oriented features, and also shares some features with Python and Lisp. Ruby is a single-pass interpreted language, and its main implementation is released under an open source license. Both Solaris and Linux support the Ruby interpreter; the current version of Ruby is 1.8.x on both. Ruby scripts can be executed either by invoking the Ruby interpreter directly from the command prompt and passing the script as a parameter, or as an executable script whose first line gives the path to the Ruby interpreter (for example, #!/usr/bin/ruby). As with Perl, ensure that the location of the Ruby interpreter is defined correctly on the first line of your scripts. For more information, refer to www.ruby-lang.org/en.

3.2.5 Python

Python is an interpreted, interactive, object-oriented programming language. Most Linux distributions come with Python bundled. There are some backward compatibility issues between Python 1 and Python 2, so the best procedure for porting Python code from Solaris to Linux is to install the same version of Python on Linux as is installed on the Solaris platform. Multiple versions of Python can coexist on a system; if your code depends on a specific version of Python, the older versions of Python are available for Linux and can be installed on the Linux platform. For more documentation, downloads, and other information, refer to www.python.org.

3.2.6 Tcl/Tk

Tcl/Tk is a scripting language that has been very popular for graphics programming. In recent years it has fallen out of vogue in favor of Python. Tcl/Tk is stable, and Solaris Tcl/Tk scripts should port easily to Linux. Tcl/Tk is included in most Linux distributions, but it may not be installed by default. For more documentation, downloads, and other information, refer to tcl.sourceforge.net.

3.3 Linker and Loader Overview

Many of the Solaris linker and run-time interpreter (also known as the run-time linker, or loader) features are available on Linux through the GNU binutils components. However, anticipate reviewing and modifying the product build and run-time configuration as part of the application port. Refer to www.gnu.org/software/binutils/ for more information.

HP recommends using the compiler to perform the final link, instead of invoking the linker directly. When you need to provide special options or you want to see more linker report information, use the -Wl,option[,option] compiler option to pass options to the link editor. See Appendix C as well as the compiler and ld(1) manpages for more information. Detailed information on the Linux linker and run-time loader, including library search order and symbol resolution, is available in Chapter 5 of this guide. For a mapping of the Solaris to Linux linker options, see Appendix C.

3.4 System Library and API Overview

Like Solaris, the Linux development environment supports both the ILP32 and LP64 addressing models, depending on the architecture. ILP32 refers to systems where the integer, long, and pointer data types are all 32 bits long; LP64 refers to systems where the long and pointer data types are 64 bits, but integers remain 32 bits. The Linux environment supports many of the UNIX development standards, such as the POSIX and X/Open® standards. Interfaces specific to Solaris are not generally available; however, packages exist that extend the Linux environment to provide compatibility with some Solaris interfaces and behaviors. One such package is the Solaris-compatible Threads Library developed by HP; refer to www.sourceforge.net/projects/sctl for more information. Additional information on these and related topics is available in Section 3.5.3, Chapter 6, and Chapter 7.

3.5 Setting Up a Linux Environment

The HP Linux and open source Web site, www.hp.com/go/linux, provides numerous guides on enabling HP-specific hardware found in both HP ProLiant and HP Integrity servers under Linux. The "Getting Started with Linux on HP Servers" whitepaper provides a high-level description of the initial installation process on both HP ProLiant and Integrity servers. This paper is available from the DSPP Linux site: h21007.www2.hp.com/dspp/files/unprotected/linuxjw.pdf. Refer to www.hp.com/go/linuxdev for more information.

3.6 Third-Party Dependencies

A key part of migrating your application is to create a complete list of all dependencies and determine their availability in the target environment. If the specific application release you are depending on is not available, you need to determine if an alternate version or application can resolve your dependency. When planning your schedule, you need to include time to migrate to these new versions or applications. HP provides a list of ISV application availability at www.hp.com/go/linuxsolutionpartners. Check with the vendor to get the latest information.


3.7 Migrating Your Application

After you determine what tools you need and how you are going to migrate your environment, you can begin to migrate the application itself. Because of the nature of the open source utilities, you have a choice of two approaches:

- Migrate your existing Solaris environment to one that uses the new tools and compilers on Solaris. After completing this step, migrate the code to the new Linux environment.
- Migrate your application and environment to Linux at the same time.

Both methods have advantages and disadvantages. The first method is effectively two migrations, but it has the advantage of creating a more direct comparison to validate the migration. The second method results in a single migration and a clean break from the Solaris environment; however, it will be harder to distinguish issues related to the tools from those related to the environment.


4 Compilers

The Linux development environment includes numerous compilers, both specialized and general. This chapter compares the Sun Studio 12 C and C++ compilers with the GNU Compiler Collection (GCC) C and C++ compilers and the Intel Compilers for Linux. The GCC compiler suite supports C, C++, Java, and FORTRAN 77 as its core; additional plug-ins and modules are available for many other languages. GCC is included with most Linux distributions. Refer to gcc.gnu.org for more information.

GCC is available on many platforms, including Solaris. One possible strategy for migrating from Solaris to Linux is to migrate first to the GNU compilers on Solaris and then later migrate to Linux. This enables you to isolate much of the platform-dependent modifications from the compiler-specific ones. While this is a good strategy, this chapter focuses on a different one: migrating directly from the Sun Studio compilers on Solaris to the GNU and Intel compilers on Linux.

One of the other compilers available for Linux, with a high level of compatibility with the GNU compilers, is produced by Intel. Intel produces a C++ compiler, which also supports C, and a Fortran 95 compiler. These compilers are free for noncommercial use but require a license for commercial use. You can find more information on the Intel® compilers at www.intel.com/software/products/compilers/linux. An Intel white paper on compatibility with the GNU compilers is available at www.intel.com/software/products/compilers/techtopics/LinuxCompilersCompatibility.htm.

4.1 C Compilers

The information in this section is based on the Sun Studio 12 C compiler for Solaris, the GNU Compiler Collection (GCC) 4.1.2 C compiler for Linux, and the Intel C++ Compiler 9.1 for Linux. The GCC 4.4.1 and Intel C++ 11.1 compilers are the most recent versions available at the time of this writing. HP recommends that you use the most current GCC patches and latest compilers supported by your Linux distribution. Write code in strict conformance with ANSI C whenever possible to provide the greatest level of portability and the most consistent behavior across all platforms. The following sections discuss these compilers in more detail.


4.1.1 Language Mode Options

The Sun, GNU C (GCC), and Intel compilers all support the C89 language features described in the C Standard ISO/IEC 9899-1990, as well as some C99 features from ISO/IEC 9899-1999. For the Sun compiler, the C99 features are enabled by default (refer to the -xc99=%all compiler option in the cc (1) manpage for more information). For the GCC compiler, you must explicitly specify the ­std=c99 or -std=gnu99 option to enable this support. The Intel C compiler provides the -std=c99 option. All three platforms accept extensions to the standard by default. The Sun compiler supports K&R compatibility extensions, and GCC and Intel support GNU extensions. For GCC, only the preprocessor supports K&R mode and the Intel compiler does not include K&R support, so applications that rely on the -Xs option on Solaris might need to be updated in order to compile successfully on Linux. Table 4-1 compares the language mode options for the Sun C, GCC, and Intel compilers. For a complete list of C compiler option mappings, see Appendix A. Table 4-1 C Language Mode Options

Sun C            GCC and Intel               Description
-Xc              GCC: -std=c99 -pedantic    C99 subset
                 Intel: -std=c99 -Wcheck
-Xa (default)    -std=c99                    C99 + K&R C (C99 prevails)
-Xt              No equivalent               C99 + K&R C (K&R C prevails)
-Xs              No equivalent               K&R C
No equivalent    -std=gnu89 (default)        C89 + GNU extensions
No equivalent    -std=gnu99                  C99 + GNU extensions
-xc99=%none      -std=c89 (same as -ansi)    C89

4.1.2 Preprocessor Macros

Table 4-2 lists the preprocessor macros that are defined by the three compilers. To display the full list of macros set by each compiler, use cc -xdumpmacros with the Sun C compiler, gcc -E -dM with GCC, and icc -E -dM with the Intel compiler. Note that the Sun compiler sets __STDC__=1 only when the -Xc option is used, while the GCC and Intel compilers set __STDC__=1 by default.

Table 4-2 Defined C Language Preprocessor Macros

Sun C                        GCC and Intel
__SunOS_<version>            __linux and __linux__ and linux
__SUNPRO_C=<version>         GCC & Intel: __VERSION__ "<version>"
                             Intel: __INTEL_COMPILER <version>
__SVR4                       No equivalent
__unix                       __unix and __unix__ and unix
__sun                        No equivalent
__sparc                      __i386 or __ia64 or __amd64 or __x86_64
__STDC__=1 (for -Xc only)    __STDC__=1 (default)

4.1.3 Optimization

Table 4-3 compares the compiler optimization levels available with the Sun, GCC, and Intel C compilers. Sun offers five levels of optimization through the -xO[n] option, where n=1-5. The GCC and Intel compilers offer four levels of optimization through the -O[n] option, where n=0-3. Optimization is off by default for both the Sun and GCC compilers; the default is -O2 for the Intel compiler. Note that -O is equivalent to -xO2 for the Sun C compiler, -O1 for GCC, and -O2 for the Intel C compiler.


Table 4-3 C Compiler Optimization Comparisons

Sun C        Sun C Description              GCC and Intel        GCC Description
Level                                       Level
-xO1         Basic local optimization       GCC: -O[1] or -O     Reduce code size and execution
             (peephole)                     Intel: -O            time without increasing compile time
-xO2         Basic local and global         GCC: -O2             Optimizations that do not involve
             optimization                   Intel: -O[2]         space-speed trade-offs
-xO3         Loop unrolling and software    No equivalent
             pipelining
-xO4         Inline functions in same       -O3                  Inlining and register allocation
             file                                                optimizations
-xO5         Use with -xprofile=p           No equivalent

4.1.4 Pragmas

GCC does not support most of the pragmas provided by the Sun C compiler, but it does provide the __attribute__ keyword to add information to declarations or definitions of objects. Note that GCC simply ignores unrecognized or unsupported #pragma statements, so be careful to convert all of them to __attribute__ keywords or wrap these platform-specific pragmas in conditional code. You can use the gcc -Wall compiler option to warn about ignored #pragma statements. The __attribute__ keyword is followed by two open parentheses. This enables you to define a macro that can be used if the compiler does not understand attributes, as shown in the following example:

#define __attribute__(ignore)

You can specify attributes with preceding and trailing double underscores. This enables you to use them without causing any naming conflicts with macros of the same name. Table 4-4 provides a mapping from the Sun C pragmas to the corresponding GCC attributes. The table also shows the use of the double underscore naming convention.


Table 4-4 Sun C Pragmas and GCC Attributes

Sun C Pragma                        GCC Attribute                     Description
align                               __aligned__                       Specifies alignment, in bytes, for variables listed
init                                __constructor__                   Specifies initialization functions
fini                                __destructor__                    Specifies finalization functions
weak                                __weak__                          Defines weak symbols
redefine_extname                    extern int oldname(int)           Assigns a new external name to a symbol
                                    __asm__ ("newname")               (for GCC, use the inline assembler statement)
ident                               Use #ident "string"               Puts the string in the .comment section
int_to_unsigned                     No equivalent                     Marks the function that returns an unsigned
                                                                      value as returning an int
nomemorydepend                      Use C99 restrict keyword          Instructs the compiler that there are no memory
                                    for loop pointers and arrays      dependencies for any of the iterations of a loop
no_side_effect                      __pure__ or __const__             Declares that the named functions have no side effects
pack                                __packed__                        Specifies that the named structure is laid out
                                                                      without padding
pipeloop                            No equivalent                     Specifies the minimum dependence distance of the
                                                                      loop-carried dependence
unknown_control_flow                __noreturn__                      Specifies names of functions that violate normal
                                                                      control flow properties (for GCC, __noreturn__
                                                                      marks functions that never return)
unroll                              No equivalent                     Suggests an unroll factor for a loop
c99 (implicit|no implicit)          No equivalent                     Find implicit function declarations
[no_]warn_missing_parameter_info    No equivalent                     Find function declarations which contain no
                                                                      parameter-type information

One of the more common uses of the Solaris pragmas is illustrated as follows. In this example, the Solaris pragma pack specifies that the named structure is to be laid out by the compiler without padding.

#pragma pack(1)        /* Set alignment to one byte */
struct foo { ... };    /* Declare the structure(s) to pack */
#pragma pack(0)        /* Reset to default alignment */


The GCC way of expressing this is:

/* Declare a packed structure */
struct foo { ... } __attribute__ ((__packed__));

This form will pack the structure to the most compact form. However, GCC enables you to express even more. You can force individual structure members to be packed, while other members are aligned in the normal way. The syntax for this is similar to the following:

/* Pack only the second field of struct foo */
struct foo {
    char aaa;
    short int bbb __attribute__ ((__packed__));
    short int ccc;
};

In this case, the member bbb is not aligned but immediately follows the member aaa in memory at offset 1. The member ccc is aligned and follows at offset 4. The Intel compiler supports some of the pragmas provided by the Sun C compiler. Table 4-5 provides a mapping from the Sun C pragmas to the corresponding Intel pragmas.

Table 4-5 Sun C and Intel C Pragmas

Sun C Pragma                        Intel C Pragma                    Description
align                               force_align                       Specifies alignment, in bytes, for variables listed
init                                init_seg                          Specifies initialization functions
fini                                No equivalent                     Specifies finalization functions
weak                                weak                              Defines weak symbols
redefine_extname                    extern int oldname(int)           Assigns a new external name to a symbol
                                    __asm__ ("newname")               (for Intel, use the inline assembler statement)
ident                               ident                             Puts the string in the .comment section
int_to_unsigned                     No equivalent                     Marks the function that returns an unsigned
                                                                      value as returning an int
nomemorydepend                      Use C99 restrict keyword          Instructs the compiler that there are no memory
                                    for loop pointers and arrays      dependencies for any of the iterations of a loop
no_side_effect                      No equivalent                     Declares that the named functions have no side effects
pack                                See -fpack-struct                 Specifies that the named structure is laid out
                                    compiler option                   without padding
pipeloop                            No equivalent                     Specifies the minimum dependence distance of the
                                                                      loop-carried dependence
unknown_control_flow                No equivalent                     Specifies names of functions that violate normal
                                                                      control flow properties
unroll                              No equivalent (see -unroll[n]     Suggests an unroll factor for a loop
                                    compiler option)
c99 (implicit|no implicit)          No equivalent                     Find implicit function declarations
[no_]warn_missing_parameter_info    No equivalent                     Find function declarations which contain no
                                                                      parameter-type information

4.2 C++ Compilers

The information in this section is based on the Sun Studio 12 C++ compiler for Solaris, the GNU Compiler Collection (GCC) 4.1.2 C++ compiler for Linux (G++), and the Intel C++ Compiler 9.1 for Linux. The G++ 4.4.1 and Intel C++ 11.1 compilers are the most recent versions available at the time of this writing. To obtain the latest functionality and bug fixes, HP recommends that you use the current release of the compiler, available from gcc.gnu.org or www.intel.com/software/products/compilers/linux. The Sun C++, GNU G++ and Intel C++ compilers all conform to the ANSI/ISO C++ International Standard ISO/IEC 14882:1998. Write your code to conform to the ANSI C++ standard for ease of porting and consistency across platforms. For a mapping of Sun C++ to G++ and Intel C++ compiler options, see Appendix B. The preprocessor macros defined for Sun C++, G++ and Intel C++ are similar to those defined for the C compilers on the three platforms. See Table 4-2 for more information. For Sun, display the macros by compiling with the -v option, for G++, use the -dM option, and for Intel C++, use the -E -dM options.


The default implementations of iostream and the Standard Template Library (STL) used by the current versions of the Sun, GNU, and Intel compilers are compliant with the 1998 C++ standard. For more information on the G++ run-time libraries, including extensions and backward compatibility, refer to gcc.gnu.org/onlinedocs/libstdc++/faq.

4.2.1 Template Implementations

The following template features are similar on Solaris and Linux:

· Both allow template specializations.
· Both allow template arguments to be omitted when the compiler can infer them.
· Both implement template members and template partial specialization.
· Both allow default template parameters.

4.2.2 Template Instantiation

For Sun Studio 8 and higher, the Sun, GNU C++, and Intel C++ compilers implement the same template instantiation methods by default. The default is that each translation unit will contain instances of the templates it uses (the -instances=global option for Sun). In a large application, this can lead to a fair amount of code replication. The Sun and G++ compilers provide the option to use a repository (via the -instances=extern option for Sun and the -frepo option for G++). This reduces the total size of object files, including any within the template cache. The Sun Forte Developer 7 C++ compiler uses a repository by default. For applications that are built with this version of the compiler, use the G++ option -frepo to employ a similar template instantiation method on Linux.


5 Linkers and Loaders

This chapter discusses the differences in the linker and run-time loader. For more information on these topics, refer to the relevant manpages and the GNU binutils info(1) pages, which are available on your Linux system and online at www.gnu.org/software/binutils/.

5.1 Link Editor

HP recommends using the compiler to perform the final link, because the compiler may silently add run-time libraries or options that you do not explicitly provide on your link line. When you need to provide special options, or you want to see more link editor report information, use the compiler's -Wl,option[,option] linker option to pass options to the link editor. See Appendix C as well as the compiler and ld(1) manpages for more information. The term linker is generally used when discussing the link editor. The following sections describe the linker in more detail.

5.1.1 Command-Line Options

Many of the linker options you are familiar with on Solaris are the same on Linux. The GNU linker was developed to function in a number of environments, and it includes mappings for many common operating system linkers to its own native form. Some of the more commonly used linker options are shown in Table 5-1. For a complete mapping of the Solaris linker options to their GNU linker equivalents, refer to Appendix C.


Table 5-1 Commonly Used Linker Options

Solaris Option    Linux Option    Description
-G                -shared         Produce shared object (library)
-h name           -h name         Set internal library name as name
-l libname        -l libname      Include library libname in link
-L path           -L path         Add path to library search path
-o outfile        -o outfile      Produce an object file outfile
-R path           -rpath path     Use path to locate dependencies at run time
-s                -s              Strip symbolic information
-V                -V              Print version of ld(1)

5.1.2 Environment Variables

Unlike Solaris, Linux does not use the LD_LIBRARY_PATH or LD_LIBRARY_PATH_64 environment variables during link. This prevents library paths required at run time by other executables from interfering with the link. The LD_RUN_PATH environment variable functions the same on Solaris and Linux: if it is set, its value is stored as a run-time library search path. GNU ld(1) does not support the LD_OPTIONS or SGS_SUPPORT environment variables. For applications invoking the linker from make(1), GNU make does provide default rules for the linker that utilize the LDFLAGS variable for default linker options.

5.1.3 Library Search Order

The search algorithm used to locate libraries specified on the link line differs between Solaris and Linux. GNU ld(1) attempts to locate libraries specified with the -l option by searching:

1. LD_RUN_PATH, if -rpath was not used
2. Directories specified by -L
3. System default directories

The system default directories include /lib, /usr/lib, /usr/local/lib, and any directories specified during the creation of GCC or binutils. This last element can vary but should not impact any applications, because these files are part of the core development environment and therefore should not contain any internal libraries. The directory search path is:

1. /usr/*-linux/lib
2. /usr/local/lib
3. /lib
4. /usr/lib

5.1.4 Linker-Defined Symbols

The Solaris object file format is ELF. Linux also uses the ELF format by default on most platforms today. Since all linker-defined symbols are defined by the ELF standard, there is good compatibility between Solaris and Linux in this regard. Detailed information on what symbols are defined and what they mean is available as part of the Linux Standard Base documentation set, at www.freestandards.org/en/LSB.

5.2 Run-Time Linking

The run-time linker is frequently referred to as the interpreter on Solaris. Linux supports multiple object and executable formats. Today, only ELF is standard; however, the kernel still provides support for several other binary formats and provides a system to register additional formats. Refer to www.tldp.org/LDP/intro-linux/html/x9025.html for more information on the binfmt_misc kernel module. The term loader (sometimes further qualified as dynamic loader or run-time loader) is generally used when discussing the run-time linker; this helps to distinguish it from the link editor. The following sections describe the run-time linking process in more detail.

5.2.1 Library Resolution

To resolve any unresolved symbols in the image, the loader constructs a list of shared objects using the information contained in the executable and its dependencies. Both Solaris and Linux provide configuration files for the loader; these configuration files determine the default search paths used when locating libraries. Solaris uses crle(1) to configure and display these files. On Linux, the ldconfig(8) command provides this functionality. Detailed information on the Linux loader is available in the form of both manpages and info(1) pages for ld.so.

For Linux distributions that support both 32-bit and 64-bit libraries within the same architecture, a single configuration file is used. This differs from Solaris, where independent configuration files are used for 32-bit and 64-bit libraries. On Solaris, 32-bit libraries are generally located in */lib, and the 64-bit versions are located in */lib/64. On Linux, the defaults depend on the architecture. For 32-bit x86 systems, libraries are generally located in */lib. To enable 64-bit x86 systems to support both 32-bit and 64-bit environments, there are two sets of libraries, generally located in */lib for 32-bit and */lib64 for 64-bit. On Intel® Itanium®-based systems, which use 64-bit libraries, libraries are generally located in */lib. Table 5-2 summarizes this information.

The process of actually locating libraries to meet the executable's shared library requirements is similar between Solaris and Linux:

1. LD_LIBRARY_PATH is searched, unless this is a setuid(2) or setgid(2) binary. Unlike Solaris, Linux does not use a separate environment variable (LD_LIBRARY_PATH_64) for 64-bit executables.
2. The embedded search path stored at link time is searched. This path is defined using the linker -rpath option.
3. Directories specified by the ldconfig(8) configuration file are then searched.
4. Finally, the system default libraries are searched, unless the binary was linked with the -z nodeflib option.

Table 5-2 Default Linux System Library Locations

Application Architecture         Default System Library Locations
x86                              /lib, /usr/lib
x86 on x86_64                    /lib, /usr/lib
x86_64                           /lib64, /usr/lib64
Intel® Itanium®-based systems    /lib, /usr/lib


Linux also provides several emulators that allow you to run non-native binaries. Each of these has specific rules used to locate dependent libraries; usually this involves an environment created with chroot(1). For information on these situations, refer to the emulator documentation.

5.2.2 Symbol Binding

Both Solaris and Linux perform lazy symbol binding by default: a symbol is not actually resolved until the first call that references it. This behavior introduces a slight delay on the first reference to a symbol, but it provides a significant performance gain during startup, because only the symbols that are actually needed are resolved. For applications with many symbols, the gain can be significant. In addition to the default lazy binding, both Solaris and Linux can bind all symbols at application startup. Do this by setting the LD_BIND_NOW environment variable to a non-null string or by using the -z now option during the link.

5.2.3 Environment Variables

The loader on both Solaris and Linux supports several environment variables for modifying the behavior of the loader as listed in Table 5-3. Some of these variables are shared; however, some are not portable between the two platforms.

Table 5-3 Mapping of Solaris Loader Environment Variables to Linux

Solaris Environment Variable    Linux Availability
LD_AUDIT                        No equivalent
LD_AUDIT_64                     No equivalent
LD_BIND_NOW                     Same functionality
LD_CONFIG                       No equivalent
LD_DEBUG                        Same functionality
LD_DEBUG_OUTPUT                 Same functionality
LD_FLAGS                        No equivalent
LD_LIBRARY_PATH                 Same functionality
LD_LIBRARY_PATH_64              No equivalent; use LD_LIBRARY_PATH
LD_LOADFLTR                     No equivalent
LD_NOAUXFLTR                    No equivalent
LD_NOCONFIG                     No equivalent
LD_NODIRCONFIG                  No equivalent
LD_NOOBJALTER                   No equivalent
LD_NOVERSION                    No equivalent
LD_ORIGIN                       No equivalent
LD_PRELOAD                      Same functionality; Linux also provides
                                /etc/ld.so.preload for system-wide use
LD_PROFILE                      Same functionality
LD_PROFILE_OUTPUT               Same functionality

5.3 Linker-Related Tools

When debugging an application porting problem, determining the source of a problem sometimes means getting into the internals of the binary image. Tools such as those discussed in the following sections can assist you in determining where various symbols are being resolved from, what dependencies your application may have, and where any symbol collisions may be occurring. These linker tools are included with most Linux distributions.

5.3.1 ldd

The ldd(1) tool displays the dependencies of a call-shared executable or a shared library. It can also be used to report the library path and other information used by the system loader to resolve the application dependencies. This tool takes into account information such as the current environment settings and the image's rpath, if any, when resolving the libraries. The Linux implementation of ldd(1) is a shell script that invokes ld.so with special options to offer a robust set of features. The resulting output is similar to the output of the Solaris ldd(1) command, but it has some differences in output formatting and provides some additional information. In addition to the command-line options, to use all of the features of ldd(1) on Linux you need to set the LD_DEBUG environment variable to values such as libs. The libs option produces output that is roughly equivalent to ldd -s on Solaris. Refer to the Linux ldd(1) and ld.so(8) manpages for details. Run the following command to see the supported LD_DEBUG options:

$ LD_DEBUG=help ldd
Valid options for the LD_DEBUG environment variable are:

  libs        display library search paths
  reloc       display relocation processing
  files       display progress for input file
  symbols     display symbol table processing
  bindings    display information about symbol binding
  versions    display version dependencies
  all         all previous options combined
  statistics  display relocation statistics
  help        display this help message and exit

5.3.2 elfdump

On Solaris, the elfdump(1) utility is used to symbolically dump selected parts of images and libraries. Linux provides similar functionality through the objdump(1) tool. In addition to ELF images, objdump(1) supports a number of other object formats. The exact list of supported formats depends on the architecture and distribution you are running.

5.3.3 ar

Use the ar(1) utility to combine multiple object files into a single archive library, primarily for ease of use. Both Solaris and Linux provide similar ar(1) commands.

5.3.4 lorder

You can use the lorder(1) utility on Solaris to attempt to determine the most efficient order in which to specify libraries on the link line so that all dependencies resolve. This utility is no longer required on Solaris, because the Solaris linker performs multiple passes to resolve all symbols, and the Linux linker also performs multipass links. Because the lorder(1) utility has been retired on Linux, consider using it only if it is needed for compatibility with your Solaris application build.

5.3.5 nm

The nm(1) utility displays the symbol tables from the specified image. Both Solaris and Linux support the same basic command lines for the nm(1) command. Additional nonstandard options may differ. Minor differences in output formatting may also occur.


6 Other Development Tools

In addition to compilers and linkers, you need other tools to complete the development environment. And nothing better demonstrates the growing importance of the Linux platform than the rich and growing set of tool options available on Linux. This chapter discusses some of these other development tools and provides links to additional information.

6.1 Source Code and Version Control

An important role of software configuration management (SCM) during an application port is managing the versioning and integrity of application source code, and ensuring that change management policies are in place and actively utilized. To fulfill the source code versioning requirements, a class of software called version control software was developed. Within this class of software, Sun provides Source Code Control System (SCCS) with its Solaris operating system for managing source code on that platform. In addition to SCCS, Sun also carries a product line known as Forte Code Management Software (formerly TeamWare) built on top of SCCS, which provides many additional features. In addition to these products, many third-party commercial and open source version control products are available for the Solaris operating system. Table 6-1 provides a list of popular open source version control software products and lists their availability on Solaris and Linux. If you are using a commercial version control software product on Solaris, check with your vendor to determine if this product is available on Linux.

Table 6-1 Open Source Version Control Software Availability

Solaris       Linux        Contact
SCCS          CSSC         cssc.sourceforge.net
CVS           Available    www.gnu.org/software/cvs
RCS           Available    www.gnu.org/software/rcs
Subversion    Available    subversion.tigris.org

The open source movement has produced many additional open source version control products, but Concurrent Versions System (CVS) is the most widely used.


The following subsections describe an open source alternative to Sun's SCCS implementation called CSSC, then describe and compare RCS, CVS, and Subversion, which are more widely used on Linux.

6.1.1 CSSC

An SCM tool to consider for a Solaris to Linux migration is the Compatibly Stupid Source Control (CSSC) utility, the GNU Project's replacement for SCCS. The purpose behind CSSC is to provide an open source utility with functionality similar to SCCS. However, this software is not yet complete and is not recommended as a full-fledged replacement for SCCS unless your project really needs SCCS and cannot use any of the alternatives, such as RCS. If this is the case, use CSSC, but plan to update to a more modern version control product in the future. Refer to cssc.sourceforge.net for more information on CSSC.

6.1.2 RCS

The Revision Control System (RCS) is supported by the Free Software Foundation. You can find information relating to RCS and its usage at www.gnu.org/software/rcs. RCS and SCCS are functionally similar, with the following notable exceptions:

· SCCS does not support binary files, whereas RCS does.
· Many consider RCS to be easier to use for first-time users. There are fewer commands, it is more intuitive and consistent, and it provides more useful arguments.
· Branches have to be specifically created in SCCS. In RCS, they are checked in as any other version.

One of the advantages of using RCS instead of SCCS is that RCS allows you to tag files with a set name; this set can then be referenced as a group using the set name. SCCS has no comparable feature and is meant primarily for keeping the revision history of individual files. RCS is normally included in most Linux distributions. To help you migrate from SCCS to RCS or CVS, a set of scripts is included with the CVS sources at www.cvshome.org, or from sccs2cvs.cvshome.org.


6.1.3 CVS

The Concurrent Versions System (CVS) differs from RCS and SCCS in its flexibility to operate in an environment with remote developers. In addition to the basic functionality of RCS and SCCS, CVS adds functionality that:

· Provides grouping functions that allow developers to treat collections of files as a single object called a module.
· Supports the client/server model of communication. Teams can share the same CVS repository as long as a network connection is present. These connections can be tunneled over a secure shell (ssh) connection.
· Supports unreserved checkouts, which enable multiple developers to work on the same file simultaneously and allow merging of the modified files during checkin.

Migration from RCS to CVS is a relatively painless process because CVS was originally built on top of RCS and still uses the RCS file format as its own. This enables you to copy RCS files directly into a CVS repository. In converting from RCS to CVS, make sure that you follow these steps:

1. Create a CVS repository. Within this repository, create an empty directory structure to hold the RCS files. Keep a backup copy of your RCS files in their original location.
2. Copy the files into the repository, taking care to preserve file dates and other information. For example, use tar(1) or cp -rp if the files are local.
3. Update the modules file in CVSROOT. This can be done using cvs co modules or cvs co CVSROOT, editing the modules file, and checking it back in again. Make sure that there is an entry in the modules file for every directory (top level and subdirectory) in the repository.
4. Test the change by attempting to check out (cvs co ...) a few of the new files you have added, and running cvs log or cvs status on some of the files you have checked out.

CVS is normally included in most Linux distributions. For more information regarding CVS, visit the project web site at

www.cvshome.org.


6.1.4 Subversion

Subversion is a relatively new entry into version control for the Linux platform and is steadily gaining popularity. Designed to supplant CVS, Subversion was written to fix many of the shortcomings of CVS, such as directory versioning and atomic commits. Subversion features the following additions to CVS:

· Subversion implements a versioned file system, which is a "virtual" file system that tracks changes to whole directory trees over time. Files and directories are versioned.
· Subversion enables you to add, delete, copy, and rename both files and directories. Every newly added file begins with a fresh, clean history all its own. Because CVS is limited to file versioning, operations such as copies and renames (which might happen to files, but which are really changes to the contents of the containing directory) are not supported in CVS. Additionally, in CVS you cannot replace a versioned file with a new file of the same name without the new file inheriting the history of the old, perhaps completely unrelated, file.
· Subversion supports atomic commits: a collection of modifications either goes into the repository completely, or not at all. This allows developers to construct and commit changes as logical chunks, and prevents the problems that can occur when only a portion of a set of changes is successfully sent to the repository.
· Each file and directory has a set of properties (keys and their values) associated with it. You can create and store any arbitrary key/value pairs you wish. Properties are versioned over time, just like file contents.
· Subversion has an abstracted notion of repository access, making it easy for people to implement new network mechanisms. Subversion can plug into the Apache HTTP Server as an extension module. This gives Subversion a big advantage in stability and interoperability, and instant access to existing features provided by that server: authentication, authorization, wire compression, and so on. A more lightweight, standalone Subversion server process is also available. This server uses a custom protocol, which can be easily tunneled over SSH.
· Subversion expresses file differences using a binary differencing algorithm, which works identically on both text (human-readable) and binary (human-unreadable) files. Both types of files are compressed and stored in the repository, and differences are transmitted in both directions across the network.

For more information on Subversion, visit the project web site at subversion.tigris.org.

6.1.5 Feature Comparison

Table 6-2 summarizes and compares the features of RCS, CVS, and Subversion. This information is applicable to these version control utilities on both Linux and Solaris.


Table 6-2 Feature Comparison of Common Version Control Systems

Feature                                                     SCCS/CSSC          RCS        CVS        Subversion
Support for ASCII and binary files                          No                 Yes        Yes        Yes
Atomic commits                                              No                 No         No         Yes
Support for copying or moving a file or directory
  to a different location while still retaining
  the history                                               No                 No         No         Yes
Remote repository replication                               No                 No         No         Limited (use third-party scripts)
Changes propagated to parent repositories                   No                 No         No         Limited (use third-party scripts)
Support for defining permissions to different
  parts of a remote repository                              No                 Limited    Limited    Yes
Change set support                                          No                 No         No         Partial
Support for line-by-line file history                       Yes                Yes        Yes        Yes
Ability to work on only one directory of the
  repository                                                No                 No         Yes        Yes
System support for tracking uncommitted changes             Yes                Yes        Yes        Yes
Support for pre-event triggers                              No                 No         No         Yes
Support for post-event triggers                             No                 No         No         Yes
Support for GUI interface                                   No                 No         Yes        Yes
Support for web interface                                   No                 No         Yes        Yes
License                                                     Proprietary/GPL    GPL        GPL        GPL

6.1.6 Porting Sources to Other Version Control Systems

As an aid to getting started using the three most common version control systems on Linux, some of the basic SCCS commands are listed with their Linux equivalents in Table 6-3.


Table 6-3 SCCS Command Comparison

Description                                     SCCS                     RCS                     CVS                        Subversion
Check in the file for the first time
  and create the history                        sccs create file         ci file                 cvs add file               svn add file
Check out a file for reading                    sccs get file            co file                 cvs -r checkout file       svn checkout file
Check out a file for editing                    sccs edit file           co -l file              cvs checkout file          svn checkout file
Check in a file previously locked               sccs delta file          ci file                 cvs commit file            svn commit file
Print history of file                           sccs prs file            rlog file               cvs history file           svnlook history file
Compare two versions                            sccsdiff -rx -ry file    rcsdiff -rx -ry file    cvs diff -rx -ry file      svn diff -rx:y file
Compare current version with last revision      sccs diffs file          rcsdiff file            cvs diff file              svn diff file
Merge changes between two versions of a file    sccs edit -ix -y file    rcsmerge -rx -y file    cvs update -jx -jy file    svn merge -r x:y file
Release a held lock on a file                   sccs unget file          rcs -u file             cvs release file           No equivalent
Abandon work on the current file                sccs unget file          rcs -u file             cvs unedit file            svn revert file

6.2 Build Tools

The Ant and Make utilities are two of the most common build tools in use on Solaris. These tools are also available on Linux. As noted in the following sections, the portability of builds based on these two tools can be very different.

6.2.1 Ant

Ant is a Java utility similar in function to the make(1) command. Ant is used by Sun One Studio, Eclipse, NetBeans, and other Java tools as well as some non-Java development systems. Ant uses an XML build file that defaults to build.xml; this file is analogous to the make command's Makefile. Ant was designed to be portable across systems, but it is extensible, so you must be careful when porting to other systems. For the most part, build.xml files should be portable. However, one common error that reduces portability is the use of system-specific paths in Ant build files. Some integrated development environments (IDEs), such as NetBeans, incorporate Ant, so there may be multiple versions of Ant installed. Other than the extra disk space, this should not pose any problems.

6.2.2 Make

The make utility is a command generator: it takes description files and some templates as input and produces a set of shell commands that are executed on behalf of the user. The description files are generally referred to as Makefiles. A Makefile normally contains a set of variable definitions followed by a set of rules. A rule contains a target list followed by a list of prerequisites (dependencies); the target and dependency lists are separated by a colon. A rule is followed by zero or more command lines, each beginning with a tab character. The templates define the default environment values, such as SHELL, and rules such as the suffix rules.

There are several variants of the UNIX make(1) command. Solaris includes the System V make and, as an option, the GNU make command. Linux includes the GNU make command. Fortunately, the GNU make command incorporates all of the System V features. Many of the differences between the two make commands are listed in the following sections. Additionally, the -p option causes make to print its defaults. For more information on GNU make, refer to www.gnu.org/software/make/manual/make.html.

6.2.2.1 Shell Issues

Many Makefiles in a complex build environment rely on the default shell. Many Solaris installations tend to use the C shell (/bin/csh), and most Linux installations standardize on the GNU Bourne Again SHell (/bin/bash). The make command uses its SHELL variable, which can be set in the Makefile or on the command line, to determine which shell parses the commands in a rule. You can change it on execution as follows:

$ make SHELL=/bin/csh


This overrides the default SHELL variable and allows make to process C shell syntax properly. The GNU make command and the bash shell are available on Solaris, so that Makefiles can be tested before the porting process begins. The C shell is supported on Linux systems.

6.2.2.2 The % Wildcard in String Substitution

Both the Solaris and GNU make commands support the use of the percent sign (%) wildcard character in the dependency and command portions of a rule. However, GNU make replaces only the first % in the pattern; subsequent % characters remain unchanged. The Solaris make command replaces subsequent % characters as well. Example 6-1 shows a Makefile in which the three source files in the SOURCES variable are to be turned into three regular (.o) and three debug (.dbg.o) object files. Example 6-2 shows a corrected Makefile for the GNU make command.

Example 6-1 Pattern Substitution with the % Character

SOURCES=a.c b.c d.c
OBJECTS=$(SOURCES:%.c=%.o %.dbg.o)
all:
	@echo OBJECTS is $(OBJECTS)

Running Solaris make on Example 6-1 results in the variable OBJECTS having the following value:

a.o a.dbg.o b.o b.dbg.o d.o d.dbg.o

Running GNU make on Example 6-1 results in the variable OBJECTS having the following value:

a.o %.dbg.o b.o %.dbg.o d.o %.dbg.o

Note that the debug (.dbg) object files are not defined when using GNU make as they are when processed with Solaris make.

Example 6-2 Corrected Pattern Substitution for GNU make

SOURCES=a.c b.c d.c
OBJECTS=$(SOURCES:%.c=%.o)
OBJECTS+=$(SOURCES:%.c=%.dbg.o)
all:
	@echo OBJECTS is $(OBJECTS)

Example 6-2 shows how to modify the Makefile so that the OBJECTS variable has the following value:


a.o b.o d.o a.dbg.o b.dbg.o d.dbg.o

6.2.2.3 The $< Macro

The $< macro is intended to be used only in suffix rules and the .DEFAULT rule. The suffix rules are a mapping of a single target to a single dependency. The $< macro is defined to reference this dependency. Example 6-3 shows a common suffix rule.

Example 6-3 Using $< in a Suffix Rule

.c.o :
	$(CC) $(CFLAGS) -c $<

Both Solaris and GNU make define $< and allow it to be used in more than just suffix rules or .DEFAULT, but in different ways. Example 6-4 shows how the $< macro can use the target suffix rule to obtain the corresponding dependency in the Solaris make, but not in GNU make. With GNU make, the $< macro is empty. Example 6-5 shows how to change your Makefile so that it is more portable and will work with GNU make.

Example 6-4 Using $< in an Action Line

foo.o bar.o:
	cc -c $<

Running Solaris make on Example 6-4 results in the following output:

> make -f targets.mk
cc -c foo.c
cc -c bar.c

Running GNU make on Example 6-4 results in the following output:

> make -f targets.mk
cc -c
cc: no input files

Solaris Makefiles with rules that have undefined dependencies as in Example 6-4 need to be modified to explicitly specify the target dependencies as shown in Example 6-5. This approach has the added advantage that these explicit rules only specify single targets, so that expansion of the macro is unambiguous.


Example 6-5 Using $< in GNU make Action Line

foo.o : foo.c
	cc -c $<

bar.o : bar.c
	cc -c $<

Running Solaris and GNU make on Example 6-5 results in the following output:

> make -f targets.mk
cc -c foo.c
cc -c bar.c

HP recommends that you also consider using the $? macro if appropriate instead of the $< macro to increase the portability of your Makefile.

6.2.2.4 Suffix Rules

There are some differences in the suffix rules defined in the Solaris and GNU make commands. See Appendix D for a comparison of these suffix rules.

6.2.2.5 Source Control Support

The Solaris make command includes a set of suffix rules supporting the SCCS source control system. These are rules that contain a tilde (~), such as .cc~.o. While SCCS is available on Linux, the GNU make command does not implement the SCCS suffix rules by default. Where a system depends on SCCS in its Makefiles, the suffix rules need to be added manually.

6.3 Integrated Development Environments (IDEs)

Various integrated development environments (IDEs) are available for Linux. Many of these are also available on Solaris. The KDE project (www.kde.org) provides KDevelop (www.kdevelop.org), a feature-rich IDE supporting multiple platforms and languages (C/C++, Java, Fortran, and others) with support for several version control systems, including CVS and ClearCase. The GNOME desktop (www.gnome.org) is the other major user interface on Linux. The GNOME project developed an IDE named Anjuta (anjuta.sourceforge.net). Anjuta supports multiple languages, platforms, and version control systems. Both KDevelop and Anjuta are at least partially integrated into their respective windowing environments.


The Eclipse IDE (www.eclipse.org) is an advanced plug-in based IDE that supports C, C++, and Java development on multiple platforms, including Solaris, Linux, and HP-UX. Additional information on IDEs is available in Chapter 6.

6.4 Debugging Tools

Sometimes debugging problems require additional functionality beyond that of standard IDE debuggers. Several standalone debuggers are available from the Linux community, many of which also support Solaris. The GNU debugger (GDB) (www.gnu.org/software/gdb) is the de facto debugger for Linux. Most other Linux debuggers are extensions to or developed from GDB. One of the most popular is the Data Display Debugger (DDD). This debugger provides a graphical front end for many debuggers, including GDB, XDB, and JDB. Refer to www.gnu.org/software/ddd/ for more information.

Tracking system calls, memory initialization issues, or memory leaks requires specialized tools. On Solaris, you can use the truss(1) utility to trace system calls and create reports on how often and how they were called. For Linux, the strace(1) utility (www.sourceforge.net/projects/strace/) provides a similar function, enabling you to trace the system calls of another process or program.

Several options exist, both commercial and open source, to track memory-related issues. For example, the IBM Rational PurifyPlus product is available for both Solaris and some Linux platforms. Free alternatives also exist in the open source community for debugging memory-related issues. Two popular alternatives are the Electric Fence library (www.freshmeat.net/projects/efence) and Valgrind (valgrind.kde.org). Electric Fence provides functionality similar to that of the watchmalloc(3) library on Solaris. Valgrind is a more aggressive utility that runs the application in an emulated environment, which allows it to monitor every memory access on both the heap and the stack. These utilities are very different, but both are highly effective at locating memory management and overflow issues.

Mudflap, a pointer-checking extension to the GCC compiler, is also available. Mudflap provides pointer-checking technology based on compile-time instrumentation. It detects the standard memory errors, such as NULL pointer usage and buffer overruns, and it also detects memory leaks and the use of uninitialized data. Mudflap was developed as part of a new optimization engine for the GNU compiler collection and has been merged into GCC 4.0. For more information, refer to gcc.gnu.org/projects/tree-ssa and the GCC 4.0 release notes at gcc.gnu.org/gcc-4.0/changes.html. Many of these debugging tools, including GDB, DDD, and strace(1), are included with most Linux distributions.

6.5 Defect Tracking Tools

Part of tracking changes is tracking why things were changed. Defect tracking software is another key part of any development project. In the Linux development environment, you can choose from several such utilities. One of the most popular is Bugzilla (www.bugzilla.org). A creation of the Mozilla Web browser project (www.mozilla.org), it has become its own project and continues to grow in popularity and features. Another popular tool for tracking defects is GNU GNATS (www.gnu.org/software/gnats).

6.6 HP Solaris-to-Linux Application Transition Tools

This section describes the Solaris-to-Linux Application Transition Tools suite developed by HP. The suite contains three products, each targeted at a different phase of the code transition cycle. Table 6-4 describes these transition tools.

Table 6-4 Solaris to Linux Application Transition Tool Summary

Transition Tool                                      Recommended Use       Brief Description
HP Solaris-to-Linux binaryScan                       Planning              Scopes the porting effort by providing the number of APIs and their disposition on Linux.
HP Solaris-to-Linux Software Transition Kit (STK)    Planning, Porting     Provides a detailed assessment of the porting effort with specific recommendations at the library and API level.
HP Solaris-to-Linux Porting Kit (SLPK)               Porting, Deployment   Provides a migration environment to address the compatibility gap between the two platforms at the API and development tool levels.


6.6.1 HP Solaris to Linux binaryScan Utility Overview

HP has developed the Solaris-to-Linux binaryScan utility to assist with the planning phase of an application port from Solaris 8 to Red Hat Enterprise Linux 3 or SUSE Linux ES 9. The binaryScan utility helps you quickly scope the porting effort without the need to access sources. This utility scans any dynamically linked executables on the Solaris operating system and produces a report that highlights the number and nature of compatibility issues with Linux. The database included in binaryScan for the Solaris-to-Linux transition covers the major Solaris libraries, including libc, libsocket, libthread, and libpthread. The Solaris-to-Linux binaryScan utility is available upon request only. To learn more about binaryScan, or to download this utility for other platform pairings, visit the Application Transition portal at http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801/?ciid=ff81db5f7b835110VgnVCM100000275d6e10RCRD

6.6.2 HP Solaris-to-Linux Software Transition Kit Overview

HP has developed a software transition kit (STK) to assist with the transition of application source code from Solaris 8 to Red Hat Enterprise Linux or SUSE Linux Enterprise Server. The HP Solaris-to-Linux STK contains software tools to scan source code written in C or C++, as well as shell scripts and make files. The STK file scanners create summary and detailed reports that identify potential porting issues and provide guidance on how to resolve them. The Solaris-to-Linux STK includes technical reference documents and more than 190 impact statements offering detailed porting information on many APIs. The Solaris-to-Linux STK is available upon request only. To find out how to obtain a kit, follow the directions at http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801/?ciid=ff81db5f7b835110VgnVCM100000275d6e10RCRD


To learn more about HP STKs, or to download STKs for other platform pairings, visit the HP STK portal at www.hp.com/go/STK.

6.6.3 HP Solaris-to-Linux Porting Kit (SLPK) Overview

The HP Solaris-to-Linux Porting Kit (SLPK) is a porting environment that helps automate the Solaris to Linux migration process. This porting aid reduces the time and effort it takes to get a Solaris application up and running on Linux. The key component of SLPK is the migration environment. This environment provides the following capabilities:

- Header files and APIs to address common platform differences
- Compiler and linker drivers to allow use of Solaris development environment options
- Support for Solaris makefiles on Linux

The SLPK Migration Environment identifies issues such as nonportable Solaris header files, system-level functionality, system APIs, and development tools, and automatically handles many of the differences. SLPK aims to address most of these differences, with each new version increasing the level of automation. Where manual code change is needed, HP recommends the use of the Solaris-to-Linux Software Transition Kit (STK) to learn more about the type of code modifications required. The STK should also be used to understand when the SLPK Migration Environment can be utilized. The SLPK supports both 32-bit and 64-bit C and C++ code. The following platforms and toolsets are currently supported:

- Solaris versions: Solaris 8 and Solaris 9
- Linux distributions: RHEL 2.1, RHEL 3, SLES 8, and SLES 9
- Linux compilers: gcc 2.96, gcc 3.2.x, Intel® C++ 7.x, and Intel® C++ 8.x
- Linux platforms: IA-32, AMD64, Intel® EM64T, and Intel® Itanium®-based systems

Refer to www.hp.com/go/SLPK for the latest information. Follow the instructions on that page for details on how to obtain the SLPK product.


6.7 Shells

The shells on UNIX have evolved from Stephen Bourne's original shell into two different genres. The first genre comprises derivatives of the Bourne shell, including the POSIX shell, the Korn shell (ksh), the Bourne Again SHell (bash), and a few others. The second genre comprises the C shell (csh), developed at the University of California, Berkeley, and the open source tcsh. The following sections address the differences between the versions of shells installed on Solaris and the equivalent shells installed on Linux. There are two perspectives to consider: the interactive perspective and the shell script perspective. Both are very important when migrating from Solaris to Linux. All of the shells available on Linux are also available on Solaris, so you can test shell scripts for portability before porting them.

6.7.1 Bourne Again SHell (bash)

The Bourne Again SHell (bash) was developed by the Free Software Foundation's GNU project and is a complete implementation of the IEEE POSIX.2 shell specification. It supports compatibility modes for both the Bourne shell and the POSIX shell; the default mode is Bourne shell with extensions. The BASH shell includes features such as interactive command-line editing and job control that are also found in the C shell (csh) and the Korn shell (ksh). This is the default shell on most Linux systems. Use the --posix option to start the BASH shell in POSIX mode. Define the POSIXLY_CORRECT environment variable before BASH is started to have BASH enter POSIX mode before reading its startup files. Because the BASH shell is available on Solaris, you can migrate your Bourne and POSIX shell scripts to BASH on Solaris and then migrate the BASH scripts to Linux. For more information on bash(1), refer to www.gnu.org/software/bash/bash.html.

6.7.2 Bourne Shell (sh)

The default shell installed on most Solaris systems is the System V Bourne shell (/bin/sh). Solaris application scripts that use this shell should use the default mode of BASH, which is the Bourne shell mode with extensions. This provides a Linux shell with good compatibility, but there are some differences. Other portability problems can come from shell script usage of commands such as ps(1), which behave differently in the Solaris System V environment than in a Linux environment. Additional portability issues may be encountered if your application uses the Solaris Job Control Shell (jsh). This shell is a Bourne shell that includes additional job control features found in some of the more recent Linux shells.

6.7.3 POSIX Shell (sh)

Solaris includes a POSIX-compliant shell in /usr/xpg4/bin/sh. This path does not typically exist on Linux. The default shell installed on most Linux systems is the BASH shell, which is largely conformant with the POSIX 1003.2 and 1003.2a specifications but includes extensions. The /bin/sh Linux command is normally a symbolic link to /bin/bash. Section 6.7.1 provides additional information on BASH.

6.7.4 Korn Shell (ksh)

The Korn shell (ksh) was written by David Korn at Bell Laboratories in the mid-1980s and was upward compatible with the existing Bourne shell. The Korn shell incorporated many of the advanced features of the C shell. The baseline AT&T version of ksh is 11/16/88, which is what is used on Solaris. For more information on ksh, refer to www.kornshell.com. The Korn shell version installed on Linux is the public domain Korn shell (pdksh); /bin/ksh is normally a symbolic link to the pdksh binary. For more information on pdksh, refer to web.cs.mun.ca/~michael/pdksh/.

6.7.5 C Shell (csh)

The C shell (csh) was written by Bill Joy at the University of California at Berkeley as a shell whose syntax is similar to the C programming language. Some of the features added by the C shell are job control, history, aliases, tilde substitution, and more customizable environment attributes. Because Joy is one of the founders of Sun Microsystems, the C shell tends to be the default user shell for many Solaris installations. While the C shell is a UNIX proprietary shell, an open source version of the C shell, called the Turbo C shell (tcsh), is available on Linux. Scripts written for the C shell are highly portable to tcsh.


6.7.6 Turbo C Shell (tcsh)

The Turbo C shell (tcsh) is a superset of the C shell originally distributed as patches against the Berkeley C shell. It is currently maintained by Christos Zoulas, who rewrote it as a free version. While it includes all of the features of the C shell, it adds many new features such as a command-line editor and interactive word completion. tcsh is available on Linux as well as Solaris. For more information on tcsh, refer to www.tcsh.org.

6.7.7 Z Shell (zsh)

The Z shell (zsh) is available on both Solaris and Linux. It incorporates features from bash, ksh, and csh. Scripts written for zsh on Solaris should be highly portable to Linux. The Z shell is not usually installed by default. For more information on zsh, refer to www.zsh.org.

6.7.8 Environment Limits

Most of the Solaris shells use command-line length limits well in excess of those defined by POSIX. The Linux command-line lengths may vary not only by vendor, but also by patch level. The default command-line length is usually sufficient for most developers, but it can be changed if you have a need for an unusually long command line. These limits are normally set in the LINE_MAX and ARG_MAX limit macros. Use the getconf(1) utility to check the current value of these limits. Another problem in porting is the use of hard-coded paths in shell scripts. It is generally good practice to use environment variables to set system-specific information.


7 System Libraries, APIs, and Run-Time Environment

This chapter describes some of the header files, structures, APIs, and behavioral differences that might require application source code changes. This chapter also provides some guidance on what porting options are available. Due to the number of Solaris APIs, a complete comparison of all APIs is beyond the scope of this guide.

Generally, porting applications from the Solaris to the Linux operating system (OS) is a simple and straightforward task. Proof of this statement is the large number of applications that have completed this port successfully. The underlying reason that most ports go smoothly is that both Solaris and Linux were designed to conform to UNIX standards. However, it is important to note that Solaris is certified as being compliant with several UNIX standards; Linux is not, due to the cost of certifying. While Linux has not undergone formal certification, it is developed by people who know the standards and have used certified environments. As a result, you will most likely be able to use code developed to these standards with little or no change.

HP recommends that you port your application to a Linux distribution that has a Linux Standard Base (LSB) certification. The LSB is a binary standard that draws heavily on a suite of existing standards, including POSIX. Porting to the LSB will not make the initial port from Solaris any easier, but it will enable you to more easily support a wider range of Linux distributions after the port is complete. For more information on the LSB, including certified Linux distributions, refer to www.linuxbase.org.

7.1 System Libraries and APIs

The system run-time libraries (libc, and so on) provide similar functionality on both Solaris and Linux. Both Solaris and Linux provide good POSIX compatibility, and applications developed based on these standards are highly portable. However, differences do exist. A key task during the port of an application from Solaris to Linux is to identify usage of nonstandard system APIs and libraries. This is particularly true of older applications that have never been ported. For example, Solaris provided a threads library that predates the POSIX threads standard. If your application was developed using these nonstandard thread APIs, you have a portability problem to deal with.

In general, the portability problems in your application can be classified as one or more of the following types:

Library requirement differences
Some Solaris APIs exist on Linux but are in different libraries. For example, exp2(3M) is implemented in the libsunmath library on Solaris. This library does not exist on Linux, but exp2(3) is implemented in libm on Linux.

Include file requirement differences
Some Solaris APIs exist on Linux but require different include files. For example, on Solaris, getopt(3C) requires stdlib.h, but on Linux, unistd.h is required.

Missing APIs and header files
Some Solaris APIs and header files do not exist on Linux. These are mainly restricted to non-POSIX APIs and nonstandard header files. An example of this type of problem is the openat(2) API: it was added in Solaris 9, it is not defined in POSIX, and it does not exist on Linux. Some stack-check APIs are also not available on Linux. These APIs are mostly used in applications that are compiled with stack checking enabled, that manage their own stacks, or that attempt to detect their own stack overflows.

API prototype and header file differences
Some API prototypes and header file declarations exist on both Solaris and Linux, but are different. An example of this is the getdents(2) API, which takes a dirent structure argument on both Solaris and Linux; however, the Linux version of this structure is fixed length instead of the variable-length dirent structure used on Solaris.

API return type differences
Some API prototypes specify return types that differ between Solaris and Linux. For example, the Solaris endhostent(3NSL) has a return type of int; on Linux, endhostent(3) has a return type of void. Other examples are readlink(2) and getmntent(3), which have different return types on Solaris and Linux.


errno differences
Solaris and Linux share APIs that are functionally equivalent when they complete successfully, but that can return different errno values on failure if the errno value is not specified by POSIX. For example, the Linux open(2) manpage states that, on failure, errno can be set to ENOMEM. Solaris application code will not be designed to handle this value, because ENOMEM is not a documented errno value for open(2) on Solaris. This type of difference exists in both directions. For example, Solaris application code may exist to handle an EIO errno value after a call to open(2), but the Linux implementation of open(2) is not documented as setting EIO on failure. Similarly, the Linux getsockopt(2) manpage does not document some errno values that Solaris does, such as ENOMEM, ENOSR, ENOBUFS, EHOSTUNREACH, EADDRNOTAVAIL, and EADDRINUSE.

Feature and semantic differences
Some Linux API implementations simply provide different features or provide a feature with slight differences. For example, the Solaris implementation of fcntl(2) provides support for the F_DUP2FD, F_SHARE, and F_UNSHARE flags, but these flags are not supported on Linux.

There are many strategies to identify these portability problems in your source code. For example, you can use the error and warning messages from the compiler and linker to identify type mismatches and undefined symbols on Linux. You can also use application code knowledge and careful code inspection, coupled with detailed Linux API knowledge and manpage study, to identify more of these Linux differences. Finally, you can use failure analysis and run-time debugging to identify many of the remaining portability problems, assuming you have a test suite that exposes these issues before your first application release to your customers. The rest of this chapter provides you with specific Linux API knowledge to better prepare you to complete your application port.
But as stated at the beginning of this chapter, it is difficult to present a reasonably complete comparison of all Solaris and Linux APIs, due to the large number of Solaris APIs and the difficulty of mapping this large information base to your application implementation. As a consequence, HP recommends the use of binary or source code scanning tools to automate this process. These tools are not perfect, but they provide a repeatable process with consistent results across all of your application modules. Because of the ease with which these tools can be applied to collect analysis data, you can use them early in the port planning process to assess the difficulty of the port, identify issues, and estimate staffing needs. See Section 6.6 for more information on the HP Solaris-to-Linux binaryScan utility and the HP Solaris-to-Linux Software Transition Kit (STK). The STK includes an API analyzer with detailed API impact statements that can be particularly useful for this task.

There are also tools that assist in making code changes and that provide Solaris-compatible run-time environments on Linux systems to minimize required code changes if the porting team wishes. The HP Solaris-to-Linux Porting Kit (SLPK) includes a source code analyzer and a migration environment to address the compatibility gap between the two platforms at the API and development tool levels. See Section 6.6 for more information on the HP SLPK.

HP recommends that a porting project consider modifying the application source code to use standards-compliant APIs and features before starting the port whenever possible. This improves the Solaris code base while reducing the effort required to port to the new Linux target platform. A common approach to this type of source code modification is to isolate the non-POSIX APIs in a macro set or porting library. The Solaris implementation can use either the nonportable Solaris APIs or the portable POSIX APIs and glue code. The new portable macro set or porting library can then be unit tested on Solaris for compatibility.

Some porting projects also include the goal that the application memory model be modified. Typically this involves going from a 32-bit to a 64-bit memory model.
You can perform this type of code base change either before or after the application port from Solaris to Linux, since both operating systems support both the 32-bit and 64-bit development environments. See Chapter 10 for more information.

Regardless of your porting methods and goals, HP strongly recommends that you enable all compiler and link warning options and carefully consider any resulting build messages. Allowing your build tools to report warnings and errors is vital on the Linux system, but frequently proves to be a good idea for your Solaris build as well.


7.2 Solaris Libraries that Are Not Available on Linux

Table 7-1 provides a mapping of some Solaris libraries to their Linux equivalents. Note that this mapping does not provide an exact equivalent for all of the APIs included in these Solaris libraries. Applications that use the APIs from these Solaris libraries can require recoding and will require build changes. The Linux compiler and linker will help you identify these APIs and libraries. Careful review of the Solaris and Linux manpages for these APIs is recommended.

Table 7-1 Missing Solaris Libraries and Linux Replacements

Solaris Library                                     Linux Library
libcrypt.so, libbsdmalloc.so, libmalloc.so,
libmapmalloc.so, libmtmalloc.so, libsocket.so       libc.so
libm9x.so, libsunmath.so                            libm.so
libaio.so, libposix4.so                             librt.so
libmd5.so                                           libsmbclient.so
libthread.so                                        See Chapter 8 for information on the
                                                    Solaris-compatible threads library (SCTL).

There are a significant number of the Solaris libraries that do not exist on Linux. Porting applications that use APIs from these libraries will require recoding and build changes. The Linux compiler and linker will help you identify these porting issues. Table 7-2 lists some of these libraries.


Table 7-2 Solaris Libraries that Do Not Exist on Linux

liba5k.so lib300.so lib300s.so lib4014.so lib450.so libadm.so libami.so libbz2.so libc2.so libc2stubs.so libcfgadm.so libcmd.so libcpc.so libcrypt_i.so libdemangle.so libdevice.so libdevid.so libdevinfo.so libdhcpagent.so libdhcpsvc.so libdhcputil.so libdoor.so libexacct.so libExbridge.so libfn_p.so libfn_spf.so libg_fc.so libGLw12.so libinetutil.so libkcs.so libkstat.so libkvm.so liblayout.so liblcl.so liblm.so libmail.so libnls.so libnvpair.so libpctx.so libplot.so libprint.so libproc.so libproject.so librac.so librcm.so librtld_db.so libsecdb.so libsmartcard.so libsmedia.so libssagent.so libssasnmp.so libsx.so libsys.so libsysevent.so libtermlib.so libtnf.so libtnfctl.so libtnfprobe.so libUil.so libvolmgt.so libvt0.so libwsreg.so libxfn.so libxil.so libXol.so

7.3 Solaris Header Files that Are Not Available on Linux

The POSIX compatibility of both Solaris and Linux results in good compatibility of the header files associated with the POSIX APIs. Some non-POSIX Solaris APIs may be declared in different header files on Linux. Some of these can be mapped as shown in Table 7-3.

Table 7-3 Mapping Solaris Header Files to Linux Header Files

Solaris Header File      Linux Header File
floatingpoint.h          math.h
sys/ieeefp.h             stdlib.h
rpc/rpcent.h             rpc/netdb.h
sunmath.h                math.h
widec.h, euc.h           wchar.h
fenv96.h                 Use math library

You should also expect to find some APIs that require inclusion of different header files. For example, on Solaris, the wctype(3C) and isw* functions (such as iswalpha(3C)) all require that the wchar.h header file be included. For Linux, your application will need to be modified to include


the wctype.h include file for these functions.

7.4 Files and File Handles

Both Solaris and Linux provide good POSIX compliance in the area of file management. These APIs are implemented as system calls on both operating systems and include creat(2), open(2), close(2), and dup(2), as well as the APIs for reading, writing, managing file status, file ownership, file permissions, file seeks, file locking, directory entries, and other similar operations.

The mount(2) API has significant differences between the Solaris and Linux implementations, including different arguments and support for fewer flags on Linux. You will need to recode your application to accommodate this difference. Refer to the mount(2) manpages for more information.

The directio(3C) API is available on Solaris but does not exist on Linux. HP recommends that you recode your application to use fcntl(2) with the O_DIRECT flag to achieve the same functionality in a more portable way.

The llseek(2) API is available on Solaris but does not exist on Linux. HP recommends that you recode your application to use the lseek64(2) function on Linux.

The Solaris version of the flock structure includes an l_sysid member to better support clustering. This structure member is not present on Linux.

7.5 Networking

The Linux networking stack is one of the fastest in the industry and has supported both IPv4 and IPv6 for years. The default Linux networking stack provides BSD sockets. For applications requiring STREAMS, a package is available at www.gcom.com/home/linux/lis/. Linux also incorporates firewall and routing capabilities in the kernel. You can also configure it to support other transport protocols, such as IPX/SPX (Novell), DDP (Apple), and DECnet (Pathworks) protocols. Those experienced with Solaris security and file serving will find the same methods supported on Linux. Classic UNIX file-sharing technologies, such as NFS (Version 2 and Version 3), are fully supported. Linux also supports direct mounting for a variety of other network file-sharing technologies, including SMB (MS Windows) file sharing, allowing sharing of files to both


UNIX and Windows hosts from a single source. As on Solaris, security authentication is handled using Pluggable Authentication Modules (PAM) on Linux. Standard installations provide modules allowing authentication using NIS/NIS+, LDAP, and others.

7.6 Math Library

The basic functions of the math libraries on Solaris and Linux are similar. This section details math library functions common to both Solaris and Linux that comply with existing standards. It also highlights areas where portability issues exist.

7.6.1 Intrinsic Math Type Comparison

As specified by the IEEE Standard for Binary Floating-Point Arithmetic, IEEE Std 754-1985, Solaris and Linux 32-bit and 64-bit systems define the size of float to be 32 bits and double to be 64 bits. The long double type is not specified by the IEEE standard. This is not a problem when porting a 32-bit or 64-bit application from Solaris to 64-bit Linux, because long double is 128 bits on both Solaris and 64-bit Linux systems. However, the long double type is 96 bits on 32-bit Linux systems, so there is a loss of precision with this nonstandard type.

7.6.2 IEEE and Fast Math

Both Solaris and Linux systems comply with IEEE Std 754-1985, which defines the bit patterns and behavior of floating-point numbers and arithmetic. Some of the behavior, such as support for denormals and infinities, results in lower than desired performance. The Solaris -fast and the GCC -ffast-math compiler switches relax the standard, resulting in a possibly significant increase in performance. The minute loss of accuracy is often an acceptable tradeoff.

7.6.3 ISO C Math Library Functions

Both Solaris and Linux implement the C99 standard, which includes the complex and imaginary floating-point types and arithmetic. To use these features with Sun Studio 9 and previous releases, you are required to supply the -lcplxsupp compiler flag to link against the libcplxsupp.a library. GCC on Linux does not require any special support other than including complex.h. Linking with the standard math library using the -lm compiler


flag is usually necessary on both systems.

7.6.4 POSIX Math Library Functions

Both Solaris and Linux implement the IEEE Std 1003.1-2001 (POSIX.1) standard math functions. This support includes math functions, such as the Bessel functions, that are not included in C99.

7.6.5 Solaris Math Library Extension

The Solaris math library includes conversion functions for ASCII-encoded decimal numbers. This functionality is not available in the Linux math library. Some of the Solaris math library functions prior to Solaris 10 were included in libsunmath.so. Solaris also supplies a vector math library, libmvec.a. The vector library functions are not available on most Linux distributions, but are available from third parties. See Section 7.6.6 for more information.

7.6.6 Proprietary Math Libraries

The following proprietary math libraries are available on Linux:

The HP Vector Math Library for Linux is a high-performance and highly accurate vector math library. It was developed jointly by CERN and Hewlett-Packard Laboratories and is available for Intel® Itanium®-based products as a free download. For more information, refer to www.hp.com/go/linuxdev/ and follow the Linux software link under Linux Resources, or search for the HP Vector Math Library for Linux.

The Intel® Math Kernel Library is a set of highly optimized, thread-safe math routines designed for engineering, science, and financial applications that require maximum performance on Intel® processors. The latest library is available from Intel® at www.intel.com/software/products/mkl/. The functional areas of the library include:

Linear Algebra (BLAS and LAPACK)
Linear Algebra (PARDISO Sparse Solver)
Discrete Fourier Transforms
Vector Math Library
Vector Statistical Library (random number generators)


The Intel® Cluster Math Kernel Library is similar to the Math Kernel Library except that it supports clusters. The latest version is available from Intel® at www.intel.com/software/products/clustermkl/.

MATLAB®, from The MathWorks, Inc., is a popular commercial math language system that is compatible with C, C++, FORTRAN, and Java. The latest version of MATLAB is available from The MathWorks at www.mathworks.com/products/matlab/ and supports both 32-bit and 64-bit Linux.

7.6.7 Other Considerations

Most applications using the standard math library functions should port cleanly with respect to the math libraries. Be careful when using some math optimizations, because the behaviors may differ between compilers. In addition to the standard math library supplied with Linux and the libraries mentioned in the previous section, the Intel® C++ compiler suite provides a replacement for the standard libraries that has been highly optimized for the target Intel® platforms. Both GCC and the Intel® C++ compiler suite have a number of specific floating-point optimizations.

7.7 Developing Libraries

Both Solaris and Linux provide methods of versioning dynamic shared objects (DSOs), also known as shared libraries.

External versioning involves creating multiple libraries with different names, usually the same name with a version string embedded. Generally the library name without the version string is also created as a reference to the latest release. While effective, this method can lead to a large number of libraries that are mostly identical. As a result, the concept of libraries with internal version strings was created, allowing you to support multiple revisions of a library with a single library. Linux has taken this concept further and allows versioning on a per-API level.

On Linux, libraries that contain only minor changes and are mostly compatible are generally versioned using internal versioning. When major changes happen, such as with a major version release (V1 to V2), external versioning is also used. For example, libc on Linux uses both internal and external versioning. Current libc releases use a file name of libc.so.6 on x86 systems, indicating which major release this library supports. The


internal versions are indicated by the string GLIBC followed by a number sequence delimited by periods (.). The first number indicates the major release, the second number indicates a minor release, and any further numbers indicate patches. For example, GLIBC_2.3.2 indicates the second patch level of the third minor release of the second major version of GLIBC. A good paper covering the history and process of developing shared libraries (DSOs) on Linux is available from people.redhat.com/drepper/dsohowto.pdf.

7.8 Security APIs

Linux system security is based on the DCE-RFC 86 Pluggable Authentication Modules (PAM) specification from SunSoft. Since this is the same open standard that Solaris PAM is based on, Linux provides good compatibility with the Solaris implementation. Both Solaris and Linux use PAM authentication to support local DES and MD5 encrypted password files. They also use PAM to support NIS and NIS+ services. The NIS services are also known as yellow pages or YP services. For more information, refer to www.kernel.org/pub/linux/libs/pam/.

File access control lists (ACLs) implement a more fine-grained permission model than is provided with the standard UNIX file owner, group, and others access permissions. File ACLs allow additional users and groups to be granted or denied access. File ACLs are available by default in the 2.6 and later Linux kernels and are available as a patch for 2.4 kernels from acl.bestbits.at. The 2.4 kernels in RHEL 3.0 and SLES 8.0 include this patch.


8 Threads

Solaris supports two distinct thread packages:

Solaris Threads. This is a proprietary thread library. It maps in part to similar Linux interfaces. As an alternative, an open source Solaris-compatible thread library is available for Linux.

POSIX Threads. These are highly portable across platforms.

There are a number of threads packages for Linux, including two POSIX thread implementations and a Solaris threads-compatible library. The following POSIX threads packages are routinely available on Linux:

LinuxThreads is the older model, which has been available for the 2.0.0 and more recent kernels.

Native POSIX Thread Library (NPTL) is the new model, which is available with the 2.6 kernel. HP recommends using NPTL when practical.

The Solaris-compatible Threads Library (SCTL) was developed by HP to assist in migrating applications from Solaris to Linux. It is also available for other platforms. For more information on the Solaris threads library for Linux, refer to www.sourceforge.net/projects/sctl. Additional thread packages, including LWP and DCE implementations, are also available for Linux. Information on these packages is available at the Linux Documentation Project. Refer to www.tldp.org/FAQ/ThreadsFAQ/.

8.1 Solaris Thread Models

Most threaded Solaris applications will use one of the following two thread models. The portability of these models varies greatly and is described in this section.

8.1.1 Solaris Threads

Solaris Threads is a proprietary, non-POSIX thread library that uses the library libthread.so. As described in Section 8.3, many of the Solaris thread APIs map to NPTL APIs. Other options are to convert the application to use POSIX threads for portability, or to use the SCTL to support the code with minimal modifications on Linux.


Solaris used a two-level threading model up to Solaris 8. In that release, Sun included an alternate, 1-on-1 libthread implementation called LWP. All later versions of Solaris use the LWP library to provide Solaris Threads.

8.1.1.1 Lightweight Processes (LWP)

Lightweight Processes (LWP) are a proprietary thread implementation on Solaris and do not comply with the POSIX 1003.1c standard. Applications using this thread model are not portable and must be converted to POSIX threads for portability. This conversion can be implemented on Solaris, to prepare the application sources before starting your application port. Alternatively, this thread work can be done on Linux as part of the application port.

8.1.2 Solaris POSIX Threads

Solaris supports POSIX 1003.1c threads using libpthread.so, which is itself layered on Solaris Threads. Solaris programs using the POSIX interfaces should port to Linux cleanly.

8.2 Linux POSIX Threads Models

The two Linux threads models both implement the POSIX 1003.1c API. The following sections discuss the two models in more detail.

8.2.1 LinuxThreads

LinuxThreads was developed in 1996 as a kernel threads model, where one Linux thread is a single Linux process. This is called the 1-on-1 threading model. Context switches between threads are performed by the Linux kernel. One advantage of this model is that it can take advantage of multiple processors. LinuxThreads implements the POSIX 1003.1 base functionality except for signal handling and some of the optional functionality. This is the most common Linux threads package on the 2.4 kernel. It is present in GNU libc version 2 and is provided as part of all current distributions. While similar to the NPTL implementation, it has a number of differences. For more information on the classic Linux threads package, refer to pauillac.inria.fr/~xleroy/linuxthreads/.

8.2.2 Native POSIX Thread Library for Linux (NPTL)

The Native POSIX Thread Library (NPTL) is a replacement for the older LinuxThreads and is POSIX compliant. Support for the NPTL was


developed in the 2.6 kernel, and the NPTL APIs are included in GNU libc. Like LinuxThreads, NPTL is implemented as a 1-on-1 model, but the new kernel changes provide a significant performance gain. The NPTL also provides per-process signals and enhanced scalability on NUMA architectures. While the NPTL was developed with the 2.6 kernel, it has been backported to some distributions of the 2.4 kernel. A design white paper providing some useful insight into this threads implementation is available from www.redhat.com/whitepapers/developer/POSIX_Linux_Threading.pdf.

8.3 Mapping Solaris Threads to the Linux NPTL APIs

Table 8-1 compares the Solaris Threads and the NPTL threads interfaces. Additional information on this mapping is available at www.redhat.com/docs/wp/solaris_port/c1347.html.

Table 8-1 Comparison of Solaris Threads to Linux NPTL

Solaris Threads                  Linux NPTL

thr_create                       pthread_create
thr_create (daemon)              No equivalent
  (Creates a daemon thread that does not exit when the process exits.)
thr_min_stack                    PTHREAD_STACK_MIN
  (Returns the minimum allowable stack size for a thread; on Linux this is
  a constant defined by POSIX threads.)
thr_exit                         pthread_exit
thr_join                         pthread_join
thr_self                         pthread_self
thr_main                         No equivalent
  (Returns true if the calling thread is the main thread.)
thr_continue                     No equivalent
thr_getspecific                  pthread_getspecific
thr_keycreate                    pthread_key_create
thr_setspecific                  pthread_setspecific
thr_suspend                      No equivalent
thr_yield                        sched_yield
thr_getconcurrency               pthread_getconcurrency
thr_setconcurrency               pthread_setconcurrency
thr_getprio                      pthread_getschedparam
thr_setprio                      pthread_setschedparam
mutex_init                       pthread_mutex_init
mutex_destroy                    pthread_mutex_destroy
mutex_lock                       pthread_mutex_lock
mutex_trylock                    pthread_mutex_trylock
mutex_unlock                     pthread_mutex_unlock
cond_init                        pthread_cond_init
cond_wait                        pthread_cond_wait
cond_timedwait                   pthread_cond_timedwait
cond_signal                      pthread_cond_signal
cond_broadcast                   pthread_cond_broadcast
cond_destroy                     pthread_cond_destroy
rwlock_init                      pthread_rwlock_init
rw_rdlock                        pthread_rwlock_rdlock
rw_tryrdlock                     pthread_rwlock_tryrdlock
rw_wrlock                        pthread_rwlock_wrlock
rw_trywrlock                     pthread_rwlock_trywrlock
rw_unlock                        pthread_rwlock_unlock
rwlock_destroy                   pthread_rwlock_destroy
sema_init                        sem_init
sema_wait                        sem_wait
sema_trywait                     sem_trywait
sema_post                        sem_post
sema_destroy                     sem_destroy
thr_sigsetmask                   pthread_sigmask
thr_kill                         pthread_kill
sigaction                        sigaction
kill                             kill
sigwait                          sigwait


8.4 Additional Information on LinuxThread Implementations

This section provides additional information on the NPTL and LinuxThreads implementations on Linux.

8.4.1 Nonportable Solaris POSIX Thread Interfaces

Both Solaris and Linux implement POSIX 1003.1c. As a consequence, Solaris code that uses the POSIX thread APIs should port cleanly to Linux. However, as indicated by their "_np" API suffix, the following Solaris POSIX thread interfaces are nonportable and are not available on Linux. If your application source code uses any of these routines, some amount of recoding will be required.

pthread_cond_reltimedwait_np()
pthread_mutexattr_getrobust_np()
pthread_mutexattr_setrobust_np()
pthread_mutex_consistent_np()
pthread_mutex_reltimedlock_np()
pthread_rwlock_reltimedrdlock_np()
pthread_rwlock_reltimedwrlock_np()

In addition, the following thread functions are part of the optional POSIX thread APIs. Solaris pthreads implements these optional APIs; the Linux implementations do not.

pthread_mutexattr_getprioceiling()
pthread_mutexattr_setprioceiling()
pthread_mutexattr_getprotocol()
pthread_mutexattr_setprotocol()
pthread_mutex_getprioceiling()
pthread_mutex_setprioceiling()

8.4.2 POSIX Thread Attribute Default Values

Table 8-2 lists the POSIX thread attributes and their default values for Solaris, NPTL, and LinuxThreads. These are defined in the POSIX standard and in the NPTL and LinuxThreads header files. Refer to pthread_attr_init (3) on both Linux and Solaris for more information.


Table 8-2 Default POSIX Attribute Values on Solaris and Linux

Attribute          Solaris Default                        LinuxThreads/NPTL Default
contentionscope    PTHREAD_SCOPE_PROCESS                  PTHREAD_SCOPE_SYSTEM (scope)
detachstate        PTHREAD_CREATE_JOINABLE                PTHREAD_CREATE_JOINABLE
guardsize          PAGESIZE                               Not documented; silently ignored
inheritsched       PTHREAD_EXPLICIT_SCHED                 PTHREAD_EXPLICIT_SCHED
policy             SCHED_OTHER                            SCHED_OTHER (schedpolicy)
priority           0                                      0 (schedparam)
stackaddr          NULL (allocated by the system)         Not applicable
stacksize          NULL (Solaris tunable, 1 or 2 MB)      Platform specific [11]

8.4.3 POSIX Threads Header Files

The NPTL header file is available at /usr/include/nptl/pthread.h on most Linux systems. The LinuxThreads header file is generally available at

/usr/include/pthread.h. You can use the compiler -I<path> option

to specify the appropriate include path when building an application.

8.4.4 POSIX Threads Shared Libraries

The NPTL library is generally placed in the /usr/lib/nptl directory. The -L<path> linker option is the preferred way to access this library.

[11] Affected by process limit if set.


8.4.5 Compiling a POSIX Threads Application

Use the -pthread compiler flag when using the GNU GCC compiler. Use the CFLAGS and LDFLAGS make and environment variables to specify the appropriate header files and libraries at build time. This helps to avoid problems caused by referencing the components of the wrong threads package.

8.4.5.1 Compiling Using LinuxThreads

The following example shows how to compile a simple LinuxThreads application:

gcc -pthread testthread.c -o testthread

8.4.5.2 Compiling Using the Native POSIX Thread Library

The following example shows how to compile a simple threaded application using NPTL:

gcc -pthread testthread.c -o testthread -I/usr/include/nptl -L/usr/lib/nptl


9 Endian Considerations

This chapter discusses the issues involved when migrating code and data from Sun SPARC systems running Solaris to Linux on HP Integrity or ProLiant servers.

9.1 Overview

A potentially significant problem in porting applications is that the Sun SPARC systems and the HP Integrity or ProLiant servers running Linux have different endian models. Endianism refers to the way in which data is stored, and defines how bytes are addressed in integral and floating-point data types.

Linux on HP Integrity and ProLiant servers is implemented as little endian, which means that the least significant byte is stored at the lowest memory address and the most significant byte is stored at the highest memory address. The Sun SPARC platforms are big endian [12], which means that the most significant byte is stored at the lowest memory address and the least significant byte is stored at the highest memory address. The following is an example of the layout of a 64-bit long integer:

                Low Address                                     High Address
Little Endian   Byte 0  Byte 1  Byte 2  Byte 3  Byte 4  Byte 5  Byte 6  Byte 7
Big Endian      Byte 7  Byte 6  Byte 5  Byte 4  Byte 3  Byte 2  Byte 1  Byte 0

Table 9-1: Bytes are numbered from least to most significant

For example, a 32-bit integer with a value of 1025 looks significantly different when examined as an array of four bytes:

Little Endian

char[0] = 0x01
char[1] = 0x04
char[2] = 0x00
char[3] = 0x00

Big Endian

char[0] = 0x00
char[1] = 0x00
char[2] = 0x04
char[3] = 0x01

[12] The Sun Solaris Operating System x86 Platform Edition is little endian.


Endian issues most often come into play during porting when bit masks are used or when indirect pointers address portions of an object. The C and C++ languages implement bit fields that help to deal with endian issues. HP recommends the use of bit fields rather than mask fields.

One area where endianism is important to understand is when systems of different endian orders must interchange data by file or network. Storing integral or floating-point data in binary format preserves the endianism of the system that stored the data. Similarly, sending binary integral or floating-point data to other systems also preserves the endianism of the sending system. When interchanging integral or floating-point data between systems, whether storing the data on a shared storage medium or sending it via data communications, convert the data to a common form used by the sharing systems.

The IP networking standards specify that network packet headers use network byte order. Network byte order is a canonical form of byte ordering that is platform-independent and is defined as big endian. This is in contrast with host byte order, which is platform-specific and can be either big endian or little endian. Several functions are used to convert 16-bit and 32-bit integers from host byte order to network byte order: htonl(3) and ntohl(3) are used to convert 32-bit integers, and htons(3) and ntohs(3) are used for 16-bit integers. There is no standard set of functions for 64-bit integers, but Linux does provide the bswap_16, bswap_32, and bswap_64 macros. These functions and macros exist on both big-endian and little-endian systems.

9.2 Persistent Data

Consider byte order carefully when reading or writing data. Multibyte values should be preprocessed such that the endian type of the source and the destination is unimportant. Note the following code example, which assumes that systems of the same endianism will be used for both writing and reading the data:

Writer Code:

#include <unistd.h>
#include <inttypes.h>

int64_t val = 1;
ssize_t result = write(fileDes, &val, sizeof(val));


Reader Code:

#include <unistd.h>
#include <inttypes.h>

int64_t valRead;
ssize_t result = read(fileDes, &valRead, sizeof(valRead));

When both the reader and writer systems are of the same endian type, the contents of valRead will be 1. However, in situations in which the reader and the writer have different byte ordering, valRead will be 0x0100000000000000. Applications that store persistent data in native endian format need to be redesigned to avoid endian issues in shared or migrated data sources. Conversion of old data is best handled by a separate process, which allows the primary application to remain focused on endian-neutral development. While the previous example uses an integral datum, floating-point data is also endian-sensitive.

The following are several methods of handling data storage in endian-neutral formats:

Store the data in a defined endian format.
Add additional data to indicate format.
Store all data as ASCII strings.

Storing data in a defined endian format is the preferred method because it requires the least overhead. Do this by developing endian-neutral I/O functions. You can develop these I/O functions by:

Using compile-time controls
Using run-time controls
Using standardized data format functions (such as htons() and similar functions)

The following sections describe each of these methods.

9.2.1 Preprocessor Controlled Byte Order

While you can use the preprocessor to control functions that need to be implemented differently based on endianism, it is not standardized. None of the current standards require that the compilers provide a means of


determining the endianism. However, that does not mean that it cannot be done. To do this, you must implement your own means of determining the endian type of a platform. Linux provides the endian.h header file, which defines BYTE_ORDER. Example 9-1 shows one way to develop code to handle this situation.

Example 9-1 Supporting Multiple Byte Ordering Using the Preprocessor

#if defined(__linux)
#include <endian.h>
#endif

#if !defined(BIG_ENDIAN)
#define BIG_ENDIAN 4321
#endif

#if !defined(LITTLE_ENDIAN)
#define LITTLE_ENDIAN 1234
#endif

#if !defined(BYTE_ORDER)
#if defined(__hpux) || defined(__sparc)
#define BYTE_ORDER BIG_ENDIAN
#elif defined(__osf__) || defined(__linux)
#define BYTE_ORDER LITTLE_ENDIAN
#endif
#endif /* BYTE_ORDER */

#if BYTE_ORDER == BIG_ENDIAN
/* some code that depends on big-endian byte ordering */
#else
/* some code that depends on little-endian byte ordering */
#endif

9.2.2 Run-Time Byte Order Control

Another means of developing endian-aware code is to dynamically test for the system endian type at run time. This can be done by taking advantage of what is normally an endian bug in software. Use the routine in Example 9-2 to check the endian order. This will enable your code to determine if it is running on a little-endian or big-endian system, and to dynamically handle either little-endian or big-endian data.


Example 9-2 Testing Byte Order

#include <inttypes.h>
#include <stdbool.h>

bool TestBigEndian(void)
{
    int16_t one = 1;
    char *cp = (char *)&one;
    if (*cp == 0) {
        return true;
    }
    return false;
}

9.2.3 Using Standard Byte Order APIs

Using standardized endian-related APIs ensures that your code is portable. One such set of APIs is the host-to-network family. By storing your application data using these APIs, you ensure that the data is stored in big-endian (network) byte order and is therefore more portable than an endian-native format. The host-to-network/network-to-host byte order conversion functions (htons(3), htonl(3), ntohs(3), and ntohl(3)) are highly optimized and should reduce to only a few instructions on little-endian systems. Example 9-3 uses the htons() function to convert a 16-bit integer from host byte order to network byte order.

Example 9-3 Using the htons(3) Routine on a 16-Bit Integer

#include <stdio.h>
#include <inttypes.h>
#include <arpa/inet.h>

int16_t w = 0x1234;
printf("Host Order w=%04x\n", w);
printf("Network Order w=%04x\n", htons(w));

Example 9-4 uses the htonl() function to convert an unsigned 32-bit integer from host byte order to network byte order. Example 9-4 Using the htonl(3) Routine on a 32-Bit Integer

#include <stdio.h>
#include <inttypes.h>
#include <arpa/inet.h>

int32_t w = 0x12345678;
printf("Host Order w=%08x\n", w);
printf("Network Order w=%08x\n", htonl(w));

One problem with the host-to-network APIs is that they only manipulate 16-bit and 32-bit data elements. Linux provides a set of byte swap macros: bswap_16, bswap_32, and bswap_64. These functions and macros exist on both big-endian and little-endian systems. The byteswap.h header file contains these macros. Example 9-5 shows the use of the bswap_64 macro and may be tested on either a 32-bit or a 64-bit Linux system.


Example 9-5 64-Bit Host to Network Using the bswap_64() Macro

#include <stdio.h>
#include <inttypes.h>
#include <byteswap.h>

uint64_t w = UINT64_C(0x123456789abcdef);
printf("Host Order w = %" PRIx64 "\n", w);
printf("Network order w = %" PRIx64 "\n", bswap_64(w));

9.3 Byte Swapping

To transfer data with multibyte values between little-endian and big-endian systems, you must provide code that swaps the byte order. For network and socket communications, use the host-to-network APIs (htons(3), htonl(3), ntohs(3), and ntohl(3)). Byte swapping can be accomplished successfully only if you have detailed knowledge of the data structure layout and the format of your data within the data structure. With this knowledge, you can determine the correct manner in which to reorder your data. For example, character strings typically do not get swapped, 64-bit elements get swapped 8 bytes end-for-end, and 32-bit elements get swapped 4 bytes end-for-end. In general, you need to know, for each data type, how it is represented at the source (network, on disk, and so on) and how it is represented at the destination.

The following types of I/O are transparent to endianism and do not need swapping:

Data that is a buffer, which is really an array of bytes, because you want the byte at the lowest address to stay at the lowest address.

Data that is read and written by the C++ framework classes. The framework classes always store data on disk in a type-tagged, compressed binary format. The processing is such that either machine type can read or write the data, and the on-disk format is the same for both types of machines.

Temporary files that are written and then read by the same invocation of an application and that are deleted before the application terminates.

Another way to swap bytes is to use the swab(3) (swap bytes) API. The prototype is:

void swab(const void *src, void *dest, ssize_t nbytes);


You can also define a preprocessor macro. For 32-bit data, the code to convert little-endian to big-endian data might look as shown in Example 9-6. Example 9-6 32-Bit Endian Byte Swap Macro

#include <inttypes.h>
#define SWAP_4_MACRO(value) \
    ((((value) & UINT32_C(0x000000FF)) << 24) | \
     (((value) & UINT32_C(0x0000FF00)) <<  8) | \
     (((value) & UINT32_C(0x00FF0000)) >>  8) | \
     (((value) & UINT32_C(0xFF000000)) >> 24))

The macro to swap the byte order of 64-bit data is shown in Example 9-7.

Example 9-7 64-Bit Endian Byte Swap Macro

#define SWAP_8_MACRO(value) \
    ((((value) & 0xff00000000000000ul) >> 56) | \
     (((value) & 0x00ff000000000000ul) >> 40) | \
     (((value) & 0x0000ff0000000000ul) >> 24) | \
     (((value) & 0x000000ff00000000ul) >>  8) | \
     (((value) & 0x00000000ff000000ul) <<  8) | \
     (((value) & 0x0000000000ff0000ul) << 24) | \
     (((value) & 0x000000000000ff00ul) << 40) | \
     (((value) & 0x00000000000000fful) << 56))

Byte swapping may be necessary when moving data between different architectures. However, it does not have to greatly affect the performance of your code. The code required to swap a 2-, 4-, or 8-byte value is just a few instructions and is easily done entirely in the registers. If you have significant data to swap, such as large arrays, all of the code should fit in a small loop that fits well in the cache, and the data can be fetched sequentially from the data cache, which is very efficient. Just be sure to understand the format of your data before migrating your code, and you will not have any problems with data integrity.

9.4 Floating-Point Data

Floating-point data is also affected by endianism. For example, a float with a value of 1.0 is stored in memory as the following byte pattern on a little-endian system:

0x0000803f

On a big-endian system, the same value is stored in memory as:

0x3f800000

One possible method to convert floating-point values from a little-endian


system to a big-endian system is shown in Example 9-8.

Example 9-8 Swapping Float Values

#include <inttypes.h>
#include <arpa/inet.h>    /* htonl() */

union int_float {
    int32_t int_val;
    float   flt_val;
};

union int_float my_var, swapped_var;

swapped_var.int_val = htonl(my_var.int_val);

9.5 Unused Bytes

Sometimes code that tries to make efficient use of memory takes advantage of the fact that often not all 4 bytes in a 32-bit integer are used. For example, if a particular int field in a record will hold only values in the range of 0 to 10,000,000, the most significant byte will always be 0. Rather than adding another element to a structure, the free byte is often used to store an element requiring only 1 byte of storage. If the most significant byte is accessed by means of a character array or by casting and dereferencing a pointer, then the code will not be portable, and slightly different versions will be needed on big-endian and little-endian machines. Example 9-9 shows a sample of this direct byte access.

Example 9-9 Direct Byte Access

#include <inttypes.h>

typedef union freebyte {
    int32_t intdata;
    char    chardata[4];
} mystruct;

#define set_int(s, x) \
    ((s).intdata = (((s).intdata & 0xFF000000) | ((x) & 0x00FFFFFF)))
#define get_int(s)     ((s).intdata & 0x00FFFFFF)

/* The char accessors are endian specific! */
#define set_char(s, x) ((s).chardata[3] = (x))
#define get_char(s)    ((s).chardata[3])

However, implementing it as a named bit field enables the compiler to generate correct instructions regardless of the endianism. Thus, the direct byte access problem can be corrected as shown in Example 9-10.


Example 9-10 Named Bit Fields

#include <inttypes.h>

/* A struct (not a union) is required here so that the 24-bit and
   8-bit fields occupy separate bits of the same 32-bit word. */
typedef struct freebyte {
    int32_t _3byte:24;
    int32_t _1byte:8;
} mystruct;

#define set_int(s, x)  ((s)._3byte = (x))
#define get_int(s)     ((s)._3byte)
#define set_char(s, x) ((s)._1byte = (x))
#define get_char(s)    ((s)._1byte)

Using bit fields in this manner grants the same memory footprint reduction and removes the endianism problems. It also makes it possible to remove the accessor macros, because each field is now a normal member that can be accessed directly.

9.6 Unions

Applications that use unions and make assumptions about the data layout within that union will have endian portability problems. Example 9-11 shows one such union.

Example 9-11 Endianism and Unions

#include <inttypes.h>
#include <stdio.h>

union int_byte {
    int32_t int_val;
    char    byte[4];
};

union int_byte my_var;

my_var.int_val = 1000000;
if (my_var.byte[3] == 0)
    printf("The number is divisible by 256\n");

On a big-endian machine, this code works correctly; byte[3] is 0 only when the number is 0 or a multiple of 256. However, on a little-endian machine, byte[3] is the most significant byte. The easiest way to avoid this problem is not to try to outsmart the compiler. You can achieve the same results and avoid problems by using endian-independent code. For example:

if ((my_var.int_val & 0xFF) == 0)
    printf("The number is divisible by 256\n");

Or better still:

if ((my_var.int_val % 256) == 0)
    printf("The number is divisible by 256\n");

87

9.7 Initializing Multiword Entities in 32-Bit Chunks

Use care when porting code that initializes multiword entities with 32-bit entities. For example, on a big-endian system, such as Solaris, an array of two 32-bit integer values is used to initialize a 64-bit double:

u.ul[0] = 0xFFFFFFFF;
u.ul[1] = 0x7FFFFFFF;

Literals like those above must be chosen with respect for the endian type of their final representation. To produce the correct results on a little-endian system, such as on Linux, you must reverse the subscripts to represent the correct byte order. For example:

u.ul[1] = 0xFFFFFFFF;
u.ul[0] = 0x7FFFFFFF;

When possible, always initialize data elements using their natural type. The language standards include support for constant initializers large enough to initialize the largest supported data types. The limits.h and float.h header files contain information on the sizes, as well as the macros for maximum and minimum values for numeric data types as defined by the ISO C standards.

9.8 Hexadecimal Constants Used as Byte Arrays

An endian problem occurs when a 32-bit value is treated as both a 32-bit value (an integer) and as an array of 4 characters. For example, the following array is equivalent to the number 0x44332211 on little-endian machines and the number 0x11223344 on big-endian machines:

char a[4] = { 0x11, 0x22, 0x33, 0x44 };

Values that are masked using constants can also affect the result when a particular byte order is expected.

9.9 Other Considerations

A trick in common use in little-endian code that is forbidden in cross-platform work is casting a pointer to an int to a pointer to a char and assuming that the least-significant byte will be at the address pointed to. For example:

unsigned int value = 0x03020100;
unsigned int *ptr = &value;
unsigned char charVal;

charVal = *(unsigned char *)ptr;


On a little-endian system, charVal is assigned the value of 0. On a big-endian system, it is assigned the value of 3. You do not want to use such code in your program, but it is very common. In old code written for a little-endian platform, it is one of the hardest things to find and root out. To accomplish the same thing in a portable way, use a temporary variable:

unsigned int temp = *ptr; charVal = (unsigned char) temp;

The assignment in the second line will take its value from the least significant byte on every architecture, whether it is at the high or low end of the temporary variable. The compiler handles the details for you. Also, you should do endian conversion on input and output and not in the middle of compute routines. This may be obvious but it is sometimes overlooked.


10 Porting from 32 Bits to 64 Bits

The Solaris architecture supports both 32-bit and 64-bit applications, but historically much of the existing code remains 32-bit. Both Solaris and Linux support the LP64 model, so porting 64-bit code from Solaris to Linux should not be affected by 32-bit to 64-bit issues. Refer to http://archive.opengroup.org/public/tech/aspen/lp64_wp.htm for more information on the LP64 memory model.

Over the past decade, 64-bit computing has been used heavily in the high-performance technical computing community. More recently, with the advent of the Intel® Itanium® 2, the AMD Opteron™, and the Intel® EM64T chips, 64-bit computing is rapidly moving into the mainstream, both in the server market and on the desktop.

Porting from 32 bits to 64 bits can be reasonably straightforward when solutions to portability issues are applied to the original 32-bit program. Some existing 32-bit code might simply work correctly in the 64-bit environment. But some legacy applications, in which function prototypes or portable programming techniques were not used, can require greatly increased porting times. This chapter describes common 32-bit to 64-bit porting issues and suggests ways to avoid them.

10.1 Why Port to 64 Bits?

A 64-bit application can directly address 4 exabytes (2^63) of virtual memory, and the Intel® Itanium® processor provides a contiguous linear address space. Another advantage is larger file size: Linux 64-bit systems allow file sizes up to 4 exabytes by default, a significant advantage to servers accessing large databases. On 32-bit systems, a 2-gigabyte limit is generally the default of both the kernel and the file systems.

While 64-bit math is available on 32-bit systems using the C language long long data type, hardware support leads to significantly faster 64-bit computations. Scientific calculations normally rely on floating-point mathematics, but financial calculations need a narrower range and higher precision than floating-point math offers. Sixty-four-bit math provides that higher-precision fixed-point math with an adequate range. The current C language standard requires the long long data type to be at least 64 bits, though an implementation can define it as a larger size.

90

Dates are another important improvement. The traditional Linux date is expressed as a signed 32-bit integer representing the number of seconds since January 1, 1970. This date overflows to an invalid date in 2038. A 64-bit system defines the date as a 64-bit signed integer. The change to 64 bits is very important in the financial industry because it uses many instruments that need to project dates well into the future.

Although 64-bit Linux distributions running on Intel® EM64T and AMD64 hardware support running IA-32 executables, these executables continue to run as 32-bit applications with all of the normal 32-bit limitations. This also requires that the application developer verify and manage a full set of 32-bit IA-32 libraries on the 64-bit system.

10.2 Linux 64-Bit Environment

The "Long, Pointer 64" (LP64) convention essentially states that pointers and long integers use 64 bits, while the int data type is 32 bits. The "Integer, Long, Pointer 32" (ILP32) is the convention used for 32-bit Linux and Windows. Table 10-1 compares the LP64 and ILP32 data types. The "Long-Long, Pointer 64" (LLP64) is another 64-bit model that has been adopted for use in the 64-bit implementations of Microsoft Windows products.

Table 10-1 Data Types and Conventions (sizes in bits)

Data Type    ILP32    LP64    LLP64
char         8        8       8
short        16       16      16
int          32       32      32
long         32       64      32
long long    64       64      64
pointer      32       64      64

Floating-point types float and double are not part of the standard because they are 32 and 64 bits, respectively, on both 32-bit and 64-bit systems. There are also some proprietary chip-based floating-point formats on some systems. Integer literals are either int or long (signed or unsigned). By default, the

91

compiler uses the value of the literal to determine the size. The suffix L denotes that the literal is a long integer, and the suffix U denotes that the literal is unsigned. On most systems, the compilers align data types on a natural boundary. This means that 32-bit data types are aligned on a 32-bit boundary, and 64-bit data types are aligned on a 64-bit boundary. In a structure, this means that filler (or padding) may be inserted by the compiler to enforce this alignment. The structure itself is also aligned based on its widest member, so on a 64-bit system a struct or union itself may be aligned on a 64-bit boundary. Example 10-1 and Table 10-2 show how this alignment may be applied by the compiler:

Example 10-1 Typical C Structure

struct align {
    int    a;
    double b;
    int    c;
    long   d;
};

On the 32-bit system, the compiler might not align the variable b even though it is a 64-bit object because the hardware treats it as two 32-bit objects. The 64-bit system aligns both b and d, causing two 4-byte fillers to be added. Note that if an application makes incorrect assumptions about the size and location of elements in a structure, and uses fixed offsets to access members of the structure, the application may not work correctly when run on a 64-bit platform. For example,

struct align aa;
double bb = *(double *)((char *)&aa + 4);   /* assumes b is at offset 4 */

This example will work as expected on a 32-bit platform, but is incorrect on a 64-bit platform due to the 32-bit padding that is added to the structure after element a.


Table 10-2 Natural Alignments

                   32-Bit System           64-Bit System
Member             Size        Offset      Size                       Offset
int a;             32 bits     0x00        32 bits + 32-bit filler    0x00
double b;          64 bits     0x04        64 bits                    0x08
int c;             32 bits     0x0C        32 bits + 32-bit filler    0x10
long d;            32 bits     0x10        64 bits                    0x18

Structure size: 20 bytes (32-bit system), 32 bytes (64-bit system)

Refer to http://www.unix.org/version2/whatsnew/login_64bit.htm to read more about 64-bit programming as supported by the Single UNIX Specification, Version 2.

10.3 Porting Issues

The C language was designed in the early 1970s as a system implementation language, and as a result of that heritage, the primitive data types, such as int, were defined based upon the most efficient size for the target processor. The designers also created a set of type conversion rules governing how arithmetic variables of different types operate together. The integer types range from the char type to the long long type, and each of these types includes signed and unsigned variants. The C language standards define the minimum sizes and the relationships between these types, but the actual sizes are defined by the implementation. In theory, an int could be 64 bits, while a long could be 128 bits. Some other languages, such as FORTRAN, are somewhat immune to 32-bit and 64-bit issues, but may have other portability issues. The following sections indicate where many of the trouble spots occur.


10.3.1 Standard Type Definitions

Some type definitions are of special interest to developers because their use enhances the portability of the code. Table 10-3 lists some common predefined type definitions.

Table 10-3 Common Type Definitions

Type                              Description
ptrdiff_t                         A signed integer type; the result of subtracting two pointers.
size_t                            An unsigned integer type; the result of the sizeof operator. It is used when passing parameters to functions such as malloc(3), and is returned from several functions such as fread(3).
int32_t, uint32_t, int64_t, ...   Integer types of a predefined width. These are defined in the 1999 C standard and are available on Linux.
intptr_t, uintptr_t               Integer types to which any valid pointer to void can be converted.

10.3.2 Pointers and Integers

As mentioned in Section 10.2, Linux supports the LP64 model, in which pointers and long integers are 64 bits and int is 32 bits. A common programming mistake is assigning pointers to int variables and vice versa. When a pointer is assigned to a 32-bit int, the high-order 32 bits are truncated. Conversely, when a 32-bit signed int is assigned to a pointer, it is sign extended to a 64-bit long before being assigned. For example, the following C program fragment:

int   foo;
long  bar;
void *pfoo = &foo;   /* &foo == 0x60000fffffffb340 */

foo = (int)pfoo;     /* truncate 64-bit pointer to 32-bit int */
bar = foo;           /* sign extension fills the high-order 32 bits */

will cause the variables pfoo and bar to contain the following values:

pfoo: 0x60000fffffffb340 bar: 0xffffffffffffb340

10.3.3 Expressions

Expressions in the C and C++ languages are based upon associativity and precedence of operators and on a set of arithmetic promotion rules. In general, a simple addition between two signed ints results in an expression that is a signed int. When an int is added to a long, the expression itself

94

becomes a long expression. If one of the operands is an unsigned int and the other is a signed int, the expression becomes an unsigned int. In a 64-bit environment, an unsigned 32-bit expression may be promoted to an unsigned long as a result of passing it as a parameter, assigning it to a 64-bit value, or possibly by being promoted during the expression evaluation. In this case, the sign bit is not propagated. Table 10-4 lists some simple expressions and the type of the expression assigned by C and C++.

Table 10-4 Expression Types

Data                       Expression   Expression Type   Comment
int i; unsigned u;         i + u        unsigned
int i; unsigned long ul;   i + ul       unsigned long     i is promoted to a signed long and added to ul. The sign bit is propagated before the addition.
int i; double d;           i + d        double            i is converted to a double before the addition.
0xFFFFFFFF                              unsigned int      This constant will not sign extend when assigned to a 64-bit object.

10.3.4 Bit Shifting

Bit shifting causes quite a bit of confusion because two expressions are involved: the bits to be shifted and the shift operand representing the number of bits to shift. The first operand defines the type of the expression, not the second operand. Table 10-5 lists the expression types and results of shifting a bit left 31 bits. The type of the second operand has no effect on the type of the expression.

Table 10-5 Shift Expressions on a 64-bit Machine

Expression             Type           Result
long val = 1 << 31     int            -2147483648 or 0xffffffff80000000
long val = 1 << 31UL   int            -2147483648 or 0xffffffff80000000
long val = 1L << 31    long           2147483648 or 0x80000000
long val = 1U << 31    unsigned int   2147483648 or 0x80000000

The difference occurs because in the first and second cases, the right side of the expression is an int. This is then sign extended and then assigned


to val. For the third case, the right side is a long, so that the bit is shifted 31 bits left, but because it is a 64-bit expression, no sign extension occurs on the assignment. In the fourth case, the right-hand side expression is an unsigned 32-bit expression; therefore, no sign extension occurs.

10.3.5 Function Parameters

In C and C++, parameters are normally passed to functions by value. In addition, C++ has a call by reference feature. In all cases, the parameters are fully evaluated first, whether they are single variables, constants, or expressions. The order of the evaluation of the parameters is unspecified and may be different, not only on different systems, but also on the same system. The C language standard defines three basic function prototype declarations. An example prototype for a function which takes parameters is:

double AMathFunction(double, int);

In this case, all the parameters are fully defined. There is another case in which C allows a variable number of parameters. In this case, a function can take an unknown number of parameters:

int printf(const char *, ...);

The ellipsis (...) tells the compiler that the caller of the function may provide more than the single parameter. Additionally, there is no type checking on the additional parameters. A function which takes no parameters should have the single type specifier void as its parameter list, like this:

int foo (void);

A fourth case is provided for compatibility to legacy C applications, where a function prototype is either not included at all or only a function declaration is included. The function declaration contains no parameter list, and similar to the variable parameter list discussed previously, there is no type checking. This compatibility is frequently termed "K&R" for Brian Kernighan and Dennis Ritchie, the inventors of C. When the data type is defined by a function prototype, the parameter is converted to that type according to the standard rules. When the data

96

type is not defined, the parameter is promoted according to the usual promotion rules defined by the standard: in a 64-bit system, integral types are converted to 64-bit integral types, and single-precision floating-point types are promoted to double-precision. If a return value is not otherwise specified, the default return value for a function is int.

While the C++ language requires fully prototyped functions, C does not. Function prototypes should always be used in a C program to support strong data typing and exploit the error-reduction properties of prototypes. The use of function prototypes also improves performance by reducing the additional code used in the promotion and demotion of data, can expose latent bugs that might exist in a program, and significantly aids porting applications to 64 bits. Parameters behave as expressions, and are evaluated before being promoted. Example 10-2 shows this case.

Example 10-2 Parameter Expression

long testparm(long j)
{
    return j;
}

int main()
{
    int i = -2;
    unsigned k = 1U;
    long n = testparm(i + k);
}

On a 64-bit system, the result here is 4294967295 because the expression i+k is an unsigned 32-bit expression, and when promoted to a long, the sign does not extend. Additionally, many systems use registers rather than the stack to pass parameters. While this should be transparent to most programs, one common programming trick can cause incorrect results. Consider the following code fragment to print the hexadecimal value of a float:

float f = 1.25;
printf("The hex value of %f is %x\n", f, f);

On a stack-based system, the appropriate hexadecimal value is printed; but in a register-based system, the hexadecimal value is read from an integer register, not from the floating-point register. A suggested solution is to use a pointer as follows:


printf("The hex value of %f is %x\n", f, *(int *)&f);

In this case, the address of the floating-point variable f is cast to a pointer to an int, which is then dereferenced. Doubles are generally 64 bits wide and comply with the IEEE-754 floating-point standard on both 32-bit and 64-bit systems.

10.3.6 Numeric Constants

Integer constants, such as 1234, are generally taken as signed 32-bit integers. The suffix L indicates a long constant, as in 1234L. The suffix U indicates an unsigned constant, and can be used either alone or with L. Hexadecimal constants are commonly used as masks or as specific bit values. Hexadecimal constants without a suffix are defined as an unsigned int if they fit into 32 bits and the high-order bit is turned on. Table 10-6 lists the properties of numeric constants.

Table 10-6 Numeric Constants

Constant      Constant Type
0xFFFFFFFF    32-bit unsigned int.
0xFFFFFFFFL   Signed long. On a 64-bit system, only the low-order 32 bits are set, resulting in the value 0x00000000FFFFFFFF.
0x7FFFFFFF    32-bit signed int.

If you want to turn all the bits on, a portable way to do this is to define a signed long constant with a value of -1. On an implementation that uses two's complement arithmetic, as virtually all mainstream implementations do, this turns on all the bits.

long x = -1L;

Another common construct is the setting of the most significant bit. On a 32-bit system, the constant 0x80000000 may be used. A more portable method is to use the defined constant LONG_MIN, which is found in the limits.h header file. Another method is illustrated in the following example, where the size of char is 8 bits:

1L << ((sizeof(long) * 8) - 1);

Because this is a constant expression, the compiler folds it into the appropriate constant, so this will work on a 16-bit, 32-bit, or 64-bit system.


10.4 Porting Tools

Several tools are available to help with the porting of 32-bit applications to a 64-bit environment. However, because some of these tools are not specifically designed as tools for porting code from 32 bits to 64 bits, they may miss some problem areas. Many of the environmental anomalies may not be apparent until run time. It is important to use a solid testing strategy.

10.4.1 Intel ® C++ Compiler

The Intel® C++ Compiler has a mode to specifically report diagnostics for 64-bit porting. Using the compiler flag -Wp64 produces some useful warnings. You can obtain the Intel® C++ compiler at www.intel.com/software/products/compilers/linux/. The code fragment in Example 10-3 contains a typical 32-bit to 64-bit coding problem.

Example 10-3 Typical 32-Bit to 64-Bit Code Porting Problem

long anumber = 5;
int number;

number = anumber;

With -Wp64 enabled, the Intel® C++ Compiler produces the following warning, which aids in identifying this porting problem:

warning #810: conversion from "long" to "int" may lose significant bits

10.4.2 The GNU Compiler Collection (GCC)

The GNU Compiler Collection (GCC) includes compilers for C, C++, and other languages. GCC is normally supplied by the distribution vendor. Additional documentation and alternate GCC versions are available at:

gcc.gnu.org.

10.4.2.1 GCC

GCC is the name of the GNU C Compiler. You can use it to help clean up an application, but it does not contain any specific 32-bit to 64-bit diagnostics. Using the -Wall, -ansi or -pedantic flags will show some questionable issues. The code fragment in Example 10-4 contains another typical 32-bit to 64-bit coding problem.


Example 10-4 Another Typical 32-Bit to 64-Bit Porting Problem

long anumber;
char *number_string = "123";

sscanf(number_string, "%d", &anumber);

GCC using -Wall issues the following warning:

warning: int format, different type arg (arg 3)

In this example, argument three, &anumber, is not the appropriate size for the %d conversion specification.

10.4.2.2 G++

G++ is the GNU C++ compiler. While G++ shares most of the same flags with GCC, there are some additional flags that are useful, such as -Wabi. The -Wabi option indicates that the compiler might be generating code that is not vendor neutral.

10.4.2.3 GDB

GDB is the GNU Debugger, and the standard debugger for Linux systems. The GDB debugger is included with most Linux distributions, but is also freely available for download at www.gnu.org/software/gdb/gdb.html.

10.4.2.4 Data Display Debugger (DDD)

DDD is the Data Display Debugger. This is included with most distributions, but is also freely available for download at www.gnu.org/software/ddd/. DDD is layered on top of GDB and provides a source window, but also provides a dynamic data display that can aid developers in locating hard-to-find anomalies.

10.4.2.5 Splint

Splint is a tool for statically checking C programs, including 32-bit to 64-bit diagnostics. It is a free version of the venerable lint command found on many UNIX systems. The -strict flag is useful in porting, although it is quite verbose. Splint is normally included with most Linux distributions. Additional information is available at www.splint.org.


11 File System and Cluster Considerations

A large number of file system and clustering options are available to ISVs porting applications from Solaris to Linux. This chapter describes the file systems and volume managers offered for Linux. It also provides an overview of the clustering options available for Linux solutions. There are two types of clustering solutions: High-Availability Clusters and High-Performance/Computational Clusters. This chapter focuses on High-Availability Clusters and provides a high-level feature comparison of Sun Clusters, HP Serviceguard, and VERITAS clustering solutions. Other clustering options not covered in this chapter, but worth noting, are the OpenSSI project and Lustre.

11.1 File System Support

With support for more than 30 different types of file systems, applications ported from Solaris to Linux should have no problem finding the capacity and features they need. The list of supported Linux file systems includes Ext2, Ext3, ReiserFS, JFS, VxFS, FAT32, NTFS, NFSv2, NFSv3, UFS, HSFS(13), UDF(14), and more. The default file system for Red Hat is Ext3, whereas SUSE uses ReiserFS. Both ReiserFS and Ext3 file systems support journaling and multiterabyte storage on 2.6 and newer kernels.

Solaris supports a smaller number of file systems, primarily to support different storage media types. HSFS is supported for CD-ROMs, UDF for DVDs, and UFS is the default file system for hard disk devices on Solaris. UFS on Solaris has logging enabled by default, making it a journaling file system. Linux also supports UFS file systems. UFS on both Solaris 8 and Linux 2.4 kernels (and earlier) is limited to 1 terabyte of file system storage, but in Solaris 9, and with the Linux 2.6 kernel, support has been enabled for multiterabyte UFS. Both systems provide large file support (a 64-bit kernel is required for Solaris).

Additionally, both Solaris and Linux support AutoFS for mounting NFS resources, and local-memory-based TMPFS for /tmp. TMPFS is generally not the default on Linux for the /tmp directory. This file system type is well supported by the Linux kernel and is used by glibc 2.2 and later to implement POSIX shared memory.

(13) High Sierra: a UNIX-like file system extension to ISO9660 (CDFS) for CD-ROMs.
(14) Universal Disk Format: for DVDs.


Use the following links to start researching some of the more widely used Linux file systems if you need additional information:

Ext2/Ext3: e2fsprogs.sourceforge.net/ext2.html
ReiserFS: www.namesys.com/
JFS: jfs.sourceforge.net/

11.2 Logical Volume Managers (LVM)

Logical volume managers (LVMs) are storage management tools that allow the creation of a virtual disk, or virtual volume, consisting of one or more physical disks. The logical volume manager runs as a layer beneath the file system, translating requests for logical disk volumes into physical device commands. It can represent several small disks as one large virtual disk (disk spanning), or one large disk as several smaller disk partitions (disk partitioning). Thus, large files can span multiple disk units.

Without a logical volume manager, file systems and individual files remain limited to a size no larger than the size of the physical disks, which becomes a problem for data-intensive applications. Combining several disks to form a logical volume can increase capacity, reliability, and performance. Unlike more primitive file system and physical partition approaches, logical volumes allow administrators to manipulate them online, without requiring a reboot. Logical volume managers oversee disks in terms of logical volumes, not physical ones.

A number of open source projects provide volume management features for the Linux platform:

- The Software RAID driver (also known as the MD driver). Refer to mdadm(8) and cgi.cse.unsw.edu.au/~neilb/mdadm for more information.
- The Logical Volume Manager (LVM) and the Enterprise Volume Management System (EVMS). Both are front ends to the Linux device mapper; see sourceware.org/dm for additional information on the device mapper, and evms.sourceforge.net for more information on EVMS.


The concept behind RAID is simple: it enables the combination of two or more devices into a single RAID device. There are two approaches to creating RAID devices; one uses special RAID hardware, and the other implements the RAID functionality in software. Hardware RAID subsystems enable redundancy and striping at the hardware level. HP hardware RAID solutions include Smart Arrays, Modular Smart Arrays (MSA), Enterprise Virtual Arrays (EVA), and XP storage solutions.

In addition to support for hardware RAID, Linux includes Software RAID, which enables the creation of RAID devices in software, without the need for hardware RAID controllers or enclosures. For example, you can combine three empty partitions, such as hda3, hdb3, and hdc3, using RAID to create a single RAID device, /dev/md0. This RAID device can then be formatted to contain a file system, or it can be used like any other Linux device. RAID can dramatically increase disk I/O performance and reliability. Using either or both of these solutions allows partitions or devices to be combined in one of many ways to provide fault tolerance and performance benefits over a single device.

Similarly, LVM2 and LVM enable you to combine two or more disks into a virtual disk, albeit without redundancy. LVM also provides additional features, such as the ability to create volume snapshots, resize volumes, and relocate data to another volume without unmounting the file system. If software redundancy is necessary, you can use LVM on top of Software RAID to create volumes with RAID redundancy.

EVMS is a recent addition to open source. EVMS manages storage in a way that is more flexible than other Linux volume management systems by providing a plugin architecture that allows for the management of other volume management technologies, such as Software RAID and LVM.
Practical tasks, such as migrating disks or adding new disks to your Linux system, become more manageable with EVMS because EVMS can recognize and read from different volume types and file systems. EVMS provides additional safety controls by disallowing unsafe commands; these controls help maintain the integrity of the data stored on the system. With EVMS, you can resize volumes, create volume snapshots, and set up RAID features for your system. Another valuable feature of EVMS is that it enables you to manage data on storage that is physically shared by nodes in a cluster. This shared storage allows data to be highly available from different nodes in the cluster. In addition to the open source offerings, VERITAS has released a suite of products that can be found at www.veritas.com/van/technologyzone/linux.jsp. VERITAS Storage Foundation combines the VERITAS Volume Manager (VxVM) and the VERITAS File System (VxFS) to provide a complete solution for online storage management. In addition to many of the standard volume management features, such as creating snapshots and resizing volumes, VERITAS Storage Foundation gives you the flexibility to move data between different operating systems and storage arrays, balance I/O across multiple paths to improve performance, replicate data to remote sites for higher availability, and move unimportant or out-of-date files to less expensive storage without changing the way users or applications access the files. Users of the Sun Solaris Volume Manager CIM/WBEM API may find that the level of support for WBEM/CIM in base Linux distributions is poor. HP provides WBEM solutions for Linux that are available at www.hp.com/go/wbem. HP-supplied WBEM providers for Linux include disk drive, disk partition, logical disk, network adapter, PCI device, physical media, physical memory, power supply, SCSI controller, and others. HP also provides WBEM clients: WBEM-enabled management applications that provide the user interface and functionality you need to manage your environment. Another project of interest that is working to provide WBEM services for Linux is SBLIM (sblim.sourceforge.net/index.html).

11.3 Sun Clusters - Background Summary

Sun Cluster 3.1 is the current version of clustering software available from Sun. It provides high availability for both x86- and SPARC-based hardware systems supported by Sun, and runs on the Solaris 8 and Solaris 9 operating systems. The clustering solution consists of the Sun Cluster software, the Solaris operating system (OS), supported Sun servers, public networks, cluster interconnects, and Sun or third-party storage solutions. A maximum of 16 nodes is supported in a Sun cluster configuration. A list of supported hardware and software components is available at www.sun.com/software/cluster/ds/ds-cluster31/.

11.3.1 Cluster Configuration Options


The clustering solution provides either single-instance application failover or a distributed application management environment:

- Single-instance application failover. An application can be configured to run on one node as a single instance and to fail over to the second node in case of a failure. This solution provides high availability for the application.

- Distributed application management environment. This option allows multiple-instance application configurations in the cluster environment. The user application service can be distributed over a number of nodes in the cluster, providing high availability and application scalability. A global IP address service is available for load balancing these applications.

Cluster tools and cluster APIs help with application configuration in the cluster environment, creating application dependencies, taking applications off line and on line, monitoring application health, and managing cluster resources. Sun provides a list of preconfigured "qualified agents" for creating HA and scalable applications, as well as the capability to build custom agents for new applications. More information on Sun clustering options and architecture is available at docs.sun.com/app/docs/doc/817-6536.

11.3.2 Manageability

Sun Cluster provides centralized cluster management with a single point of administration. Cluster management tools include the GUI-based SunPlex Manager and a command-line interface, the Sun Cluster CLI. These tools configure, manage, and monitor the cluster configuration. Resource groups, shared devices, volume management, and user applications can all be managed with these cluster administration tools. Fault monitors are available for applications, file systems, disk paths, and networks to monitor resource status and detect failures.

11.3.3 Cluster Communication

Cluster interconnects enable internode cluster communication. All nodes in the cluster are physically connected through a minimum of two interconnects. Standard fast Ethernet and gigabit Ethernet can be used for Sun cluster interconnects. For special configurations requiring high-speed communication, Sun Cluster provides Remote Shared Memory (RSM) technology. The RSM API allows applications to bypass the TCP/IP stack and access a high-speed, low-latency hardware interconnect for internode communication. RSM requires Sun's proprietary Scalable Coherent Interconnect (SCI-PCI). More information about cluster interconnects is available at docs.sun.com/app/docs/doc/817-6536/6mltstlhi?a=view.

11.3.4 Resource Configuration

Sun Cluster provides abstract resources: a set of global networks, devices, and file services that are available to all nodes directly or indirectly connected to them. Sun provides both local and global devices. As the name suggests, local devices are private to each node in the cluster. Global devices, in contrast, are available clusterwide to all the cluster nodes. The cluster software manages global devices using a device ID (DID) driver, which assigns a unique ID to each device, making every device uniquely identifiable in the cluster. Global devices include disks, tapes, and CD-ROMs. Currently, only disk devices are highly available in the cluster environment, because they have multipath and multihost support enabled. Tapes and CD-ROMs are accessible from all nodes in the cluster, but they are not highly available.

11.3.5 Cluster File System

Applications can access a cluster file system in the Sun Cluster environment. The cluster file system uses the underlying global devices available to the various nodes in the cluster. Disk devices with multiple host paths can be used to create a highly available cluster file system for applications. The cluster file system provides highly available file-locking functionality through the fcntl(2) interfaces, which allows applications running on different nodes to synchronize data access. The Sun cluster file system is, however, independent of the underlying operating system file system. The operating system's file systems are not part of the cluster file system; they are private to each node in the cluster. User accounts and profiles are not shared among the various cluster members by default, but they can be manually configured to be identical.
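The effect of advisory locking can be seen on any Linux system with a small experiment. The sketch below uses flock(1), which takes whole-file flock(2) locks rather than the fcntl(2) record locks Sun Cluster's file system provides, but the advisory semantics it illustrates are the same: a second process cannot obtain an exclusive lock while another process holds one.

```shell
lockfile=/tmp/cfs-demo.lock
touch "$lockfile"

# Hold an exclusive lock on the file in a background process for 2 seconds.
( flock -x 9; sleep 2 ) 9>"$lockfile" &

sleep 0.5   # give the background process time to take the lock

# A non-blocking attempt from this process fails while the lock is held.
if flock --nonblock --exclusive "$lockfile" true; then
    result="lock acquired"
else
    result="lock busy"
fi
echo "$result"

wait        # let the background lock holder finish
```

In a clustered file system the same discipline applies across nodes rather than just across processes on one host, which is why applications must cooperate through locking rather than assume exclusive access.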

11.3.6 Application Programming Interfaces (APIs)

Sun Cluster provides several methods for creating and managing highly available and scalable data services, or "agents," in the clustering environment. The programming options available are:


- Low-level cluster API. The low-level API is the Resource Management API (RMAPI). It provides a set of functions that allow services to collect information about various cluster resources and to configure and modify their states. These functions are available both as C functions and as command-line operations. The libscha.so library contains the RMAPI routines.

- High-level cluster functionality library. The next level of programming interface is the Data Service Development Library API (DSDL API). The DSDL API is built on top of the RMAPI and provides functionality for creating data services. DSDL API routines are available in the libdsdev.so library and are the building blocks for application data services. You can use the DSDL API to create customized application data services.

- A tool that automatically generates data services for applications in the cluster environment. The SunPlex Agent Builder automates the process of creating data services and application packages using the previously mentioned APIs. The SunPlex Agent Builder is useful for simple applications only; applications with complex requirements and dependencies are typically created by manually writing customized code using the DSDL API.

Refer to docs.sun.com/app/docs/doc/816-3385 for additional information on the Sun Cluster API.

11.4 HP Serviceguard Clusters - A Comparison

The HP Serviceguard Clusters product offers a comprehensive alternative for ISVs looking for a clustering solution on Linux. HP Serviceguard for Linux provides high-availability and disaster-tolerant solutions on a large range of HP servers and storage. The discussion in the following sections focuses on the high-availability offering of HP Serviceguard. The latest version, HP Serviceguard A.11.16 for Linux, supports both Novell SUSE Linux Enterprise Server 9 and Red Hat Enterprise Linux 3. The servers supported by HP Serviceguard include HP ProLiant servers with both 32-bit and 64-bit extensions and 64-bit HP Integrity servers. A more detailed description of the product features is available at www.hp.com/go/serviceguard.

11.4.1 Cluster Configuration Options

HP Serviceguard allows applications to be configured for single-instance failover or multiple-instance availability environments. Single-instance failover offers high availability to an application in case of a hardware failure: the application configured in a single-instance environment fails over to the surviving node if the primary node fails. Multiple-instance configuration runs the application on multiple nodes simultaneously, providing a distributed environment for the application and offering both high availability and scalability. HP Serviceguard provides cluster configuration and management tools that you can use to create application "packages" to automate the process of taking applications on line and off line, and to define resource dependencies for these application packages. HP Serviceguard packages are similar in functionality to agents in Sun Clusters. Cluster management tools also monitor the cluster as a whole and its various components, including the application packages. A number of management tools, such as HP Serviceguard Manager, Network Manager, Package Manager, and Storage Volume Manager, configure and manage the various components of the clustering solution. Serviceguard also provides preconfigured toolkits for frequently used applications.

11.4.2 Manageability

HP Serviceguard Manager is a cluster management tool that simplifies configuration, administration, and monitoring of a high-availability cluster solution. It provides a single point of management for multiple clusters. Serviceguard Manager is a GUI tool that manages the cluster as a whole; you can also use it to drill down to the level of individual nodes and application packages, and to start, stop, and modify packages, nodes, and clusters. Serviceguard Manager can be integrated with other HP management solutions, such as the HP OpenView portfolio of applications and HP Systems Insight Manager, for more comprehensive, systemwide management of the solution. For more information on the HP OpenView offering, refer to openview.hp.com/solutions/.


HP Event Monitoring Services (EMS) can provide high-availability monitoring for system resources such as disks and network interfaces. EMS offers hardware and software vendors a free set of APIs for creating monitors that integrate with the existing EMS framework. Refer to h71028.www7.hp.com/enterprise/cache/4175-0-0-0-121.html for more information on cluster manageability.

11.4.3 Cluster Communication

HP Serviceguard requires continuous communication among the various cluster nodes via heartbeat messages. These messages can be communicated over any TCP/IP network configured as a heartbeat device. To avoid network congestion, it is recommended that you configure a private network as the heartbeat network. You can use regular Ethernet, fast Ethernet, or Gigabit Ethernet for cluster internode communications. Support for industry-standard, high-bandwidth InfiniBand for HP Serviceguard cluster interconnects is planned for a future release of the product.

11.4.4 Resource Configuration

HP Serviceguard packages are applications configured to run in a highly available environment. A package is a combination of application services and system resources, such as disks, disk volumes, and IP addresses. There are no global devices in HP Serviceguard; instead, shared storage devices are used as resources by application packages. These shared devices need direct paths to the servers so that they can fail over with the package from one node to another. Packages requiring network resources can be configured with relocatable IP addresses, which act as virtual IP addresses that move to the secondary node's network interface in case of a package failover. Channel bonding is also available and allows the grouping of LAN interfaces: one interface in the bonded group stays active and transmits data, while the other network interfaces act as backups.
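On Red Hat-style distributions of this era, channel bonding is typically configured with interface definition files like the hypothetical fragments below. The addresses and values are illustrative only; details vary with the distribution and the bonding driver version.

```shell
# /etc/modprobe.conf -- load the bonding driver in active-backup mode,
# so one interface carries traffic while the others stand by.
alias bond0 bonding
options bond0 mode=active-backup miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bonded interface.
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- a slave (repeat for eth1).
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```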

11.4.5 Cluster File System

Currently HP Serviceguard does not provide a cluster file system. Each node in the cluster has its own private file systems for the operating system and applications. Raw devices available on shared storage can be used by applications like Oracle RAC, which require sharing data among the various nodes in the cluster. HP Serviceguard plans to support the Red Hat Global File System (GFS) in the future. Similar to Sun Cluster, HP


Serviceguard does not allow user accounts and profiles to be shared across the cluster, but identical configurations can make the account management process easier.

11.4.6 Application Programming Interfaces

HP Serviceguard provides toolkits for a large number of applications, such as Apache, MySQL, Oracle, SAP, Samba, NFS, and more. These toolkits contain preconfigured instructions for making applications highly available in the Serviceguard environment. A toolkit is installed from an RPM package and consists of a set of Bash shell scripts that start, stop, and monitor an application in the cluster environment. You can create custom packages with the help of existing templates, specifying options for resources, failover policies, and scripts to configure customized application packages. For a list of available toolkits for Linux, refer to www.hp.com/go/serviceguard and follow the Linux link under the High availability solutions section.
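Such control scripts follow a common pattern: one Bash script that the cluster software invokes with an action argument. The skeleton below is a hypothetical sketch of that pattern, not an actual toolkit; the function bodies merely echo what a real script would do, and the comments mark where application-specific logic would go.

```shell
#!/bin/bash
# Hypothetical package control script skeleton: the cluster software
# calls it with "start", "stop", or "monitor".

start_app() {
    echo "starting application"     # e.g. run the application's startup command
}

stop_app() {
    echo "stopping application"     # e.g. run the application's shutdown command
}

monitor_app() {
    # Return 0 if healthy, non-zero otherwise (e.g. check the process list).
    echo "application is running"
}

action="${1:-start}"                # default action, for demonstration

case "$action" in
    start)   start_app ;;
    stop)    stop_app ;;
    monitor) monitor_app ;;
    *)       echo "usage: $0 {start|stop|monitor}" >&2; exit 1 ;;
esac
```

Keeping start, stop, and health checking behind one uniform entry point is what lets the cluster software treat very different applications as interchangeable packages.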

11.5 VERITAS Clusters - A Comparison

VERITAS has an extensive set of products that enable high availability for Linux solutions. For ISVs migrating from the Sun Cluster product, VERITAS offers a comprehensive solution to match their requirements. The high-availability products from VERITAS are available for a large number of platforms, including Solaris, HP-UX, and Linux. VERITAS provides software solutions with common technology, architecture, and management across multiple platforms, making it a viable alternative for heterogeneous environments. The VERITAS Cluster Server product supports both Red Hat Enterprise Linux 3.0 and SUSE Linux 8. A more detailed product compatibility list is available at www.veritas.com/Products/.

11.5.1 Cluster Configuration Options

VERITAS offers a number of high-availability solutions, but the core clustering product is VERITAS Cluster Server (VCS). Similar to the Sun Cluster configuration, applications in the VCS environment can be configured for single-instance and multiple-instance deployment. The single-instance application failover is defined under the Failover Service Group in VCS, and the multiple-instance application forms the Parallel Service Group. Another option, called the Hybrid Service Group, allows a combination of single and multiple instances across VERITAS-defined system zones. VERITAS Cluster Server creates a framework for configuring and managing applications to provide high availability and scalability. The VCS environment defines resources, resource dependencies, service groups, and agents. Agent processes manage all the applications and cluster resources in the VERITAS cluster environment. Agents perform operations on applications such as bringing them on line, taking them off line, monitoring them, collecting information, and carrying out other customized actions. VERITAS offers a large number of preconfigured agents that monitor various applications and databases. The VCS framework also provides tools that manage cluster controls, communication among members, synchronization of various components, and group memberships, along with a number of background processes for monitoring resources. Other high-availability offerings from VERITAS include Cluster Server Traffic Director for load balancing, Global Cluster Manager for distributed clusters, and Volume Replicator for data replication. Refer to www.veritas.com/Products/www?c=product&refId=20 for more information.

11.5.2 Manageability

VERITAS Cluster Server can be managed using a command-line interface (CLI) or a graphical user interface (GUI). Both Java and Web-based GUI cluster managers are available. The Cluster Manager tool can be used to start, stop, and monitor VERITAS Cluster Server, configure administration privileges, and manage multiple nodes in a cluster. A large number of CLI utilities are available to query, modify, create, and delete resources, service groups, and agents. The GUI-based Cluster Manager is useful for overall cluster monitoring, querying, and modifying various cluster services. VERITAS provides two additional management products: Global Cluster Manager and CommandCentral Availability. These products provide more comprehensive, long-distance, multicluster management capabilities.

11.5.3 Cluster Communication

VERITAS Cluster Server does not require a special hardware interconnect for cluster communications; it uses private network paths and a Low Latency Transport (LLT) protocol for this communication. LLT serves as a low-latency, high-performance replacement for the IP stack. It is used for cluster heartbeat traffic, internode communication, and the even distribution of network traffic among the various private networks.


VERITAS suggests a minimum of two private network paths between all cluster nodes, to reduce bottlenecks and to provide redundancy for cluster communications.

11.5.4 Resource Configuration

Resources in VERITAS Cluster Server (VCS) include network interfaces, IP addresses, disk devices, file systems, and applications. These resources are configured and managed by VCS and can have dependencies defined on other resources. VCS creates service groups, which bring together sets of resources under a common logical group unit. An application service group can be defined to run as a failover group or as a parallel group, depending on the high-availability requirements of the application. Disk devices need direct paths to the servers so that they can fail over with the service group from one node to another. Services requiring network access define virtual IP addresses, and the virtual IP address moves to the secondary node in case of a service failover. Agents are processes that monitor and manage different types of resources; they also communicate with VCS regarding resource administration. VERITAS provides a large number of predefined agents for various applications (Apache, BEA WLS, IBM WebSphere, Oracle Apps, SAP, Siebel) and databases (Oracle, DB2, Informix, MySQL, and Sybase). Refer to www.veritas.com/Products/www?c=optionlisting&refId=20 for a list of VERITAS VCS agents.
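VCS stores this configuration in its main.cf file. The fragment below is a hypothetical sketch of a failover service group with a mount resource and a virtual IP address that depends on it; the group name, resource names, and attribute values are invented for illustration, so consult the VCS documentation for authoritative syntax.

```
group websg (
    SystemList = { node1 = 0, node2 = 1 }
    AutoStartList = { node1 }
    )

    Mount webmnt (
        MountPoint = "/web"
        BlockDevice = "/dev/sdb1"
        FSType = ext3
        )

    IP webip (
        Device = eth0
        Address = "192.168.1.100"
        )

    // The virtual IP comes up only after the file system is mounted.
    webip requires webmnt
```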

11.5.5 Cluster File System

Storage Foundation Cluster File System (SFCFS) is a cluster file system product from VERITAS that allows multiple nodes in a cluster to access common file system storage. The product is commonly known as VERITAS CFS. Storage with direct paths to the various node members can be configured as a cluster file system. VERITAS CFS is used mainly by applications sharing data and configuration. Applications need to implement application-level locking for simultaneous data access by multiple processes. Operating system file systems remain private to the node members, so there is no shared root. VERITAS CFS uses dynamic multipathing (DMP) to spread I/O requests, improving storage performance and availability, and an I/O fencing mechanism protects against data corruption. For more information about VERITAS Storage Foundation Cluster File System, refer to www.veritas.com/Products/www?c=product&refId=209.


11.5.6 Application Programming Interfaces

VERITAS provides a large number of command-line interfaces (CLIs) for managing the cluster server, administering resources, and managing the various service groups. With appropriate user privileges, these CLIs can be used in shell scripts to automate configuration and monitoring of the VERITAS cluster components. VERITAS also provides an extensive set of APIs for creating user-defined agents, which can be built in C++, Perl, or shell scripts. The VERITAS Cluster Server Agent Developer's Guide gives a detailed description of the various APIs and how to use them to create customized agents. You can find the VCS 4.0 Agent Development by Example at www4.symantec.com/Products/van?c=product&refId=20.

11.6 Additional Information

Linux has several commercial and open source offerings:

- Linux Clustering Information Center - www.lcic.org
- HP StorageWorks Scalable File System (HP SFS) - www.hp.com/go/technicalstorage
- Red Hat Global File System (formerly Sistina) - www.redhat.com/software/rha/gfs/
- OpenSSI - www.openssi.org
- Lustre - www.lustre.org
- OpenGFS - opengfs.sourceforge.net
- Cluster Project Page - sourceware.org/cluster
- Oracle Cluster File System (OCFS) - oss.oracle.com/projects/ocfs
- HP Serviceguard for Linux - www.hp.com/go/serviceguard
- SteelEye LifeKeeper - www.steeleye.com
- PolyServe Matrix HA - www.polyserve.com
- The Beowulf Project - www.beowulf.org
- Sandia Computational Plant (Cplant) - www.cs.sandia.gov/cplant
- PNNL's 11.8-teraflops HP Integrity Linux cluster - www.hp.com/techservers/clusters


12 Linux with Security Enhanced Linux (SELinux)

This chapter covers issues involved in porting code from Solaris to Linux with the Security Enhanced Linux (SELinux) security module. Because SELinux is now enabled by default in some Linux distributions, your application may be installed on a system in this environment. Your application may have problems with the increased security, or you may want it to take advantage of the security features SELinux provides. If either situation applies, this chapter provides the information you need to better understand SELinux, along with pointers to where you can learn more.

12.1 Background

The Linux operating system continues to evolve and add new features. One such feature is Security Enhanced Linux, or SELinux. SELinux implements a mandatory access control (MAC) security policy in addition to the traditional Linux discretionary access control (DAC) policy. SELinux is installed and enabled by default in some Linux distributions, which means that programs that run under the traditional Linux DAC security policy may not run on a system with SELinux. This chapter gives a brief introduction to SELinux and offers some suggestions for getting programs to run in this environment. SELinux is one of several Linux Security Module implementations (see Section 12.1.2). AppArmor provides an alternative approach. Unlike SELinux, AppArmor does not target trusted environments; it provides application-focused access controls. Each application is explicitly granted standard UNIX permissions, independent of the file system, to the specific files and directories it needs, and a default deny is applied to all other files and directories on the system. AppArmor also enables the mediation of POSIX capabilities[15] within processes.

[15] POSIX capabilities are a partitioning of the all-powerful root privilege into a set of discrete privileges that may be individually granted to, or removed from, different processes.


An easy-to-use graphical user interface and other tools assist in configuring the security controls. The AppArmor distribution includes a number of default and sample policies for many Internet-facing applications, such as Apache, Sendmail, Postfix, and OpenSSH. Because policies are defined on a per-application basis, standalone applications are not affected by running AppArmor. For more information on AppArmor, see forge.novell.com/modules/xfmod/project/?apparmor.

12.1.1 Origins

SELinux is the result of collaboration between the US National Security Agency (NSA) and the open source community to implement a flexible mandatory access control policy in a widely distributed operating system. As enterprise systems are increasingly exposed to public networks, the danger of system compromise is greater, and the traditional DAC security model is vulnerable to a number of exploits. MAC provides stronger security because it imposes strict rules on all users and processes and limits even the power of the root account. Therefore, if a daemon process running as root is subverted, it cannot gain full control of, or access to, the system. MAC environments can be quite restrictive and difficult environments in which to get applications running. SELinux, however, implements a flexible MAC policy: while the MAC policy cannot be overridden, it can be modified in authorized ways and, once installed, is universally enforced. Furthermore, the default MAC policy in Linux targets strict access controls on Internet-facing daemon processes and leaves the rest of the system with mostly the standard DAC.

12.1.2 Linux Security Module (LSM) Framework

Although SELinux is installed by default in some Linux distributions, it is an optional part of any Linux environment because it is implemented as a Linux Security Module (LSM). The LSM framework provides Linux with a standard set of kernel APIs for policy enforcement modules. These APIs are generic in nature, so a security module can implement additional security access controls. The security module is called whenever the kernel makes access decisions, as well as when the kernel creates or deletes subjects and objects. One important fact to note about the LSM framework is that the security module is called only if the traditional Linux security policy permits an access. If the access is denied by the traditional Linux security policy, the security module is not called. Thus, the security module can only be more restrictive than Linux access control; it cannot grant what has been denied by traditional Linux access rules. SELinux implements access controls based on a security context. A security context is created for each subject (process) and each object (file, socket, pipe, and so forth) when subjects and objects are created by the kernel. The security context consists of security elements, and the assignment of values to these elements is determined by the policy, not the user. Remember that this is mandatory access control, not discretionary access control. The element used primarily for access control is the type element, and the MAC policy is called type enforcement. This is also called domain type enforcement because it is common to refer to the type of a process as a domain and to say that the process executes in a domain.

12.1.3 Linux Distributions containing SELinux

Because the LSM framework is a standard part of the Linux 2.6 kernel, any distribution based on the 2.6 kernel can support SELinux. However, not all distributions provide packages for SELinux, while others install and enable it by default. Red Hat Enterprise Linux and Fedora Core install and enable SELinux by default, and packages are available for Debian, Gentoo, Ubuntu, and others.
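On a distribution that ships SELinux, you can check whether it is enabled and in which mode with a few standard commands. This is a sketch: the commands come with the SELinux userland packages (libselinux-utils/policycoreutils) and are absent on systems without SELinux.

```shell
getenforce      # prints Enforcing, Permissive, or Disabled
sestatus        # detailed status: mode, policy name, file system mount point

# Temporarily switch to permissive mode for debugging (requires root);
# access denials are then logged but not enforced.
setenforce 0
```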

12.2 Type Enforcement (TE)

Standard Linux DAC is based on the user and group of a process. When a process attempts to access a file or other object on the system, the owner and group of the object and the mode bits are compared to the effective user and group(s) of the process. If the mode bits allow access, access is granted. If the effective user ID of the process is root (zero), all access is granted. The owner of the file or object may change the object's protection attributes. SELinux does not base access decisions on the effective user or group identity of the process, but on the domain of the process and the type of the file or other object. SELinux maintains a separate set of attributes, called a security context, for each process and each file or other object. The type attribute of the security context is used for access control. A default deny rule prohibits all accesses between types; explicit rules must permit any accesses, and the root account is also subject to these rules. There is no override to the type enforcement rules, even for the traditionally all-powerful root account. Processes still execute as root in order to pass traditional Linux security checks, but the type enforcement policy implemented by the SELinux security module requires explicit authorization under its MAC policy. Any access to any object must be explicitly authorized by the security policy; there are no exceptions. SELinux implements finer-grained access controls than the traditional Linux read, write, and execute. For example, there are 20 discrete access controls on files, and different access controls exist for different classes of objects: the access controls for files are not the same as those specified for sockets.

12.3 SELinux Policy

The traditional Linux DAC policy allows the file owner to set access controls that determine who has read, write, and execute permission to their files. The familiar owner and group IDs, together with the mode bits on a file, are the expression of this DAC policy. SELinux assigns a security context to each process, file, and other object, and enforces access controls based on the security context rather than the mode bits, owner ID, and group ID of the file. The collection of rules that assign the security context and enforce access controls is called the security policy. All accesses must be explicitly defined because a default deny rule is in effect. The policy is compiled and read into the kernel at boot time.

Unlike the traditional Linux DAC policy, which has no options, SELinux offers a choice of policies. Some Linux distributions provide several precompiled policies from which to choose; sources are also available. The default targeted policy, available with Red Hat Enterprise Linux 4 and Fedora Core 5, should have a minimal impact on most user applications. For this reason, people who want to learn more about SELinux should consider one of these (or later) distributions.

SELinux enforces access controls based on the security context. The security context consists of three security elements: user:role:type. The type element is used for access control and is the heart of type enforcement. The user and role elements are intended for Role-Based Access Control (RBAC) and are not extensively used in the default policy. There is a fourth security
attribute, called a sensitivity label or range, that can optionally be used; the default configuration of an installed SELinux system hides this attribute. The standard Linux commands ps(1) and ls(1), which display the DAC identity of processes and files, have an additional optional parameter, -Z, to display the security context. For example, the ps(1) command produces the following output:

# ps -Z 2735
LABEL                         PID  TTY    STAT  TIME  COMMAND
user_u:system_r:unconfined_t  2735 pts/0  Ss    0:00  bash

The ls command produces the following output:

# ls -Z sample.txt
-rw-rw-r--  andy  andy  user_u:object_r:user_home_t  sample.txt

From the previous output, in order for the bash shell to access the sample.txt file in the home directory, there must be explicit rules allowing access from the unconfined_t domain to the user_home_t type. That is the simple part of type enforcement. However, labeling every object on the system with a security context, assigning a security context to every process, and defining a separate set of access controls is a daunting task. Precompiled and packaged security policies provide an off-the-shelf solution, and many distribution packages contain policies for SELinux. Policy sources are available for those who want to compile their own custom policies, or who want to examine the components of policies to learn more. Originally, policies were based on what is called the Example Policy. That policy source tree had problems with modularity and has been replaced by the Reference Policy. All current policies are based on the Reference Policy, which is available for download from oss.tresys.com/projects/refpolicy. Source policies for the targeted, strict, and mls policies are also available in RPM format.
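The three elements of a security context can be pulled apart with ordinary shell tools. The following sketch splits a context string on colons; the context value shown is a fabricated sample, not output from a live system:

```shell
# Split a sample three-element security context into user, role, and type.
# The value of ctx is illustrative; on a real system it would come from a
# command such as ls -Z.
ctx="user_u:object_r:user_home_t"
IFS=: read -r se_user se_role se_type <<EOF
$ctx
EOF
echo "user=$se_user role=$se_role type=$se_type"
```

On an SELinux system, strings of this form appear in the output of ls -Z and ps -Z as shown above.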

12.4 The Targeted Policy

There are several packaged policies available for SELinux; some implement stricter controls than others. For example, the default policy installed with SELinux on Red Hat Enterprise Linux 4 is called the targeted policy because it targets Internet-facing daemons for strict mandatory access controls.


As illustrated in Figure 12-1, the targeted policy does this by assigning each daemon process to a specific domain, with precise rules written to allow access only to what that daemon needs to perform its usual tasks. These controls are mandatory: even if the daemon process executes as the root account, it has access rights only to what is explicitly defined in the policy. It also means that if someone exploits a security flaw to attempt unauthorized access to the system, the daemon process cannot subvert the entire system, as it could under traditional Linux DAC security.

Figure 12-1 Targeted policy containment of three network services

Processes other than the targeted Internet-facing daemons execute in a single common domain called the unconfined domain. The MAC rules written for this domain permit any process executing in it to access any object on the system; the mandatory access controls are permissive for all accesses in this domain. Processes executing in this domain therefore effectively have the same access controls as in traditional Linux DAC. Any new files or other objects added to the system have the appropriate security context applied to them, and any new processes execute in the unconfined domain. Thus, applications ported to a Linux system running SELinux should work as they do in standard Linux. On a system running the latest SELinux (Red Hat Enterprise Linux 5), services are provided with targeted policies enabled by default.

12.5 Fixing Application Problems

In the ideal world, everything works as planned. However, in the real world things work most of the time, and contingencies are needed for the times when they do not. Although the type enforcement
of SELinux for the unconfined domain should not break an application that works with a traditional Linux DAC security policy, in practice it sometimes does. What are the alternatives?

The simplest, but not the wisest, course of action is to disable SELinux. This means that the SELinux security module is not loaded into the kernel; not only are MAC access decisions not rendered for processes, but security contexts are not created for new files. As a result, files created while SELinux is disabled have no security context, and if SELinux is re-enabled at a later date, the system will probably not work properly. Furthermore, SELinux provides a valuable level of protection that should not be lightly discounted and disabled.

What are the alternatives to disabling SELinux? You can place SELinux in permissive mode. In this mode, the SELinux security module is loaded into the kernel but permits accesses that are normally denied by the security policy. It still creates contexts for new processes, files, and other objects, but when an access would be denied by policy, it simply writes audit records detailing the reason why the access would have been denied. Analysis of these records can uncover the reason for the failure, and a tweak to the policy may permit the application to run with SELinux in enforcing mode.

12.6 Diagnosing Access Failures

SELinux makes access decisions whenever the Linux kernel makes access decisions. To render faster decisions, it maintains a cache of recent access decisions. Whenever an access is denied, an access vector cache (AVC) message is written either to the system log file or, if auditing is installed, to the audit trail. A simple search for messages containing the string "AVC" reveals why the access was denied. AVC messages are ASCII strings; each message contains a time stamp, the command string, the process context, the context of the object being accessed, and the reason for the denial. The system log file is typically /var/log/messages; the audit log file is typically /var/log/audit/audit.log.

An important utility program for diagnosing policy problems is audit2why(8). This program takes AVC messages as its input and outputs the cause of the access denial, along with suggestions on how to correct the problem. Policy may be changed in several ways, but the most common are toggling boolean settings and writing a local policy module.
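The fields of an AVC message can be extracted with standard text tools. The following sketch pulls the source context, target context, and object class out of a denial record written in the usual audit.log format; the PID, file name, and contexts in the sample record are invented for illustration:

```shell
# Extract key fields from a sample AVC denial record.
# The record text below is fabricated for illustration.
avc='type=AVC msg=audit(1165272436.212:301): avc: denied { getattr } for pid=2714 comm="httpd" name="index.html" scontext=user_u:system_r:httpd_t:s0 tcontext=system_u:object_r:samba_share_t:s0 tclass=file'
echo "$avc" | grep -o 'scontext=[^ ]*'   # context of the acting process
echo "$avc" | grep -o 'tcontext=[^ ]*'   # context of the object accessed
echo "$avc" | grep -o 'tclass=[^ ]*'     # class of the object (file, socket, ...)
```

On a real system, the same pattern applied to /var/log/audit/audit.log (or /var/log/messages) isolates the contexts involved in a denial before feeding the records to audit2why(8).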


12.7 SELinux Policy Booleans

SELinux has conditional policy rules that can be enabled or disabled based on the current values of a set of policy booleans. Policy booleans allow runtime modification of the security policy without loading a new policy. Booleans were created as a configuration mechanism to enable or disable policy rules that provide a specific feature. For example, for the Apache web server there is a policy boolean (httpd_enable_cgi) to enable Common Gateway Interface (CGI) programs. CGI programs are a security risk, but they are often needed to provide additional web functionality. The default targeted policy does not permit CGI programs to be executed, but if they are needed on the system, an administrator can turn on the boolean to enable them. This change can be temporary, lasting until the system is rebooted, or permanent, so that each time the system is rebooted the boolean is set to the desired value. There is a set of booleans for each Internet-facing daemon to enable specific functionality needed for the environment.

Unless the application being ported to Linux uses the services of the Internet-facing daemons confined by the targeted policy, there may not be a boolean to enable specific functionality. However, SELinux also enforces certain memory protection operations that are not enforced in traditional Linux, and there are boolean values to enable and disable these checks. For example, if a program writes to a memory-mapped file and makes the region executable, the result is an execmod violation. A frequent cause of execmod violations is text relocation. This access is seen on files, usually dynamic shared objects (DSOs). Basically, a DSO is loaded, and at some point the application determines that the code needs text relocation and uses the mprotect call to set the memory region to read/write. After the text relocation, the region is marked back to read/exec, which triggers the access check.
This problem was seen with Google's Picasa, and Google had advised people to disable SELinux. The problem was in the Picasa shared libraries; changing the type of the library from lib_t to textrel_shlib_t allows Picasa to work with SELinux enabled. These protections were implemented to thwart code modification exploits, but some programs use techniques that are flagged by these controls.


Once it is determined that these controls are preventing an application from working, either the boolean controlling the operation can be changed, or the program or library can be assigned to a type that permits these operations.
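As a sketch, a Picasa-style relabeling fix would look like the following session on an SELinux system. The library path is a hypothetical example, and the commands require root:

```
# chcon -t textrel_shlib_t /usr/lib/libexample.so
# semanage fcontext -a -t textrel_shlib_t /usr/lib/libexample.so
# restorecon -v /usr/lib/libexample.so
```

The chcon(1) command changes the type immediately; the semanage fcontext entry makes the change survive a file system relabel, which restorecon(8) then applies from the stored configuration.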

12.8 Adding a Local Policy Module

It is possible that a boolean value will not fix the access denial problem. Booleans are put into the policy to control common functionality, but may not address the reason the application is failing. It may be necessary to add new rules to allow specific access needed by the application. An important SELinux utility that aids in this task is audit2allow(1). This program takes AVC messages as input and outputs the actual policy rules that will correct the access denial. These rules should be carefully analyzed to ensure that they permit exactly what the application needs to function properly, not a blanket allowance for all SELinux denials on the system. The security principle of least privilege should not be ignored.

Once the proper set of rules is determined, a local policy module can be generated as documented in the article Building SELinux Policy Modules (www.city-fan.org/tips/BuildSeLinuxPolicyModules). When this module is loaded, the rules are activated and should permit the application to run successfully. The module is loaded automatically whenever the system is rebooted or policy is reloaded. If the application still does not run successfully, put SELinux in permissive mode, capture any AVC messages from the audit trail, and use audit2allow(1) to generate additional rules for review and possible addition to the local policy.
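The usual workflow can be sketched as the following session on an SELinux system (the module name "local" is arbitrary; the commands require root):

```
# grep avc /var/log/audit/audit.log | audit2allow -M local
# cat local.te
# semodule -i local.pp
```

The audit2allow -M option generates both a readable rule source (local.te), which should be reviewed before loading, and a compiled module package (local.pp) that semodule(8) installs.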

12.9 Tools and Utilities

There are tools and utilities to help you manage SELinux. Depending on the distribution, some are available through a GUI; others are available only as command-line utilities. They include utilities to manage SELinux modes, security attributes, policy, audit, and boolean values. Many of the section 8 commands must be run by the root user, with an appropriate role if applicable. Utilities that change the SELinux attributes of objects or set the attributes of processes must be permitted by the policy, or SELinux must be in permissive mode.


12.9.1 SELinux Modes

Table 12-1 lists command-line utilities to determine the status of the current SELinux policy and the enforcement state of that policy. SELinux can be in one of three states:

1. Enforcing: SELinux security policy is enforced.
2. Permissive: SELinux logs warnings and violations instead of enforcing.
3. Disabled: SELinux is not loaded in the kernel.

Table 12-1 Utilities to Identify and Manage the Current Policy Enforcement

Utility         Synopsis
sestatus(8)     Gets information about current SELinux policy, version, and modes
getenforce(8)   Gets the current mode of SELinux
setenforce(8)   Toggles enforcing/permissive SELinux modes

You can use the setenforce(8) command to toggle between enforcing and permissive modes. To disable SELinux, the administration interface for configuring "Security Level and Firewall" must be used and the system rebooted. Disabling SELinux is not advisable, because security contexts will not be created for new objects, and enabling SELinux at a later time may cause problems.
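A typical session with these utilities might look like the following on an SELinux system; the output shown is illustrative and abbreviated (sestatus prints additional fields):

```
# sestatus
SELinux status:          enabled
Current mode:            enforcing
Policy from config file: targeted
# setenforce 0
# getenforce
Permissive
# setenforce 1
# getenforce
Enforcing
```

Note that setenforce(8) changes the mode only until the next reboot; the persistent setting is taken from the SELinux configuration file.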

12.9.2 SELinux Security Context

Table 12-2 lists command-line utilities that identify and modify the security context of SELinux subjects and objects, as well as assume a new role and change the categories associated with a file. The -Z command-line switch has also been implemented in other Linux utilities.


Table 12-2 Utilities to Identify and Manage Security Context States

Utility           Synopsis
newrole(1)        Runs a shell with a new role
chcon(1)          Changes the security context of a file
runcon(1)         Runs a command with the specified security context
chcat(1)          Changes file categories
restorecon(8)     Sets file security contexts
setfiles(8)       Sets file system security contexts
ls -Z             Gets the security context of a file
ps -Z             Gets the security context of a process
matchpathcon(8)   Gets the default security context for a specified path

12.9.3 SELinux Security Policy Management

Security policy management is an essential task in SELinux. Occasionally, the security policy needs tuning to accommodate site-specific requirements. To tune policy correctly, it is useful to query the current policy and determine the current settings. The policy management utilities in Table 12-3 provide the ability to query policy as well as to generate and load a local policy module.

Table 12-3 Utilities to Compile, Check, Query, and Manage SELinux Policy

Utility         Synopsis
semanage(8)     Policy management tool to add SELinux users, assign roles, and so forth
sesearch(1)     SELinux policy query tool
seinfo(1)       SELinux policy query tool
checkpolicy(8)  SELinux policy compiler
sechecker(1)    SELinux policy checking tool
sediff(1)       SELinux policy difference tool
apol(1)         SELinux policy analysis tool
semodule(8)     Manages SELinux policy modules


12.9.4 Audit Utilities

The audit utility provides an important capability: tracing the security-relevant actions of processes, which is one of the basic requirements of any trusted system. SELinux adds utilities that use the audit trail to generate policy rules addressing process failures. The utilities shown in Table 12-4 provide access to this information.

Table 12-4 Audit Log Utilities to Generate Reports, Analysis, and SELinux Policy

Utility             Synopsis
aureport(8)         Produces summary reports of audit daemon logs
ausearch(8)         Queries audit daemon logs
seaudit_report(8)   SELinux audit log reporting tool
seaudit(8)          SELinux graphical audit log analysis tool
audit2allow(1)      Generates policy allow rules from logs of denied operations
audit2why(8)        Translates SELinux audit denials into descriptions of why access was denied

12.9.5 SELinux Booleans

SELinux booleans are used to enable or disable conditional policy statements. The utility programs in Table 12-5 provide the ability to query and set the state of booleans.

Table 12-5 Utilities to Manage SELinux Control Booleans

Utility           Synopsis
getsebool(8)      Gets SELinux boolean value(s)
setsebool(8)      Sets SELinux boolean values
togglesebool(8)   Flips the current value of a boolean
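For example, enabling the httpd_enable_cgi boolean discussed in Section 12.7 might look like the following on an SELinux system; the -P option makes the change persistent across reboots:

```
# getsebool httpd_enable_cgi
httpd_enable_cgi --> off
# setsebool -P httpd_enable_cgi on
# getsebool httpd_enable_cgi
httpd_enable_cgi --> on
```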

12.10 SELinux Policy Editors

SELinux policies are very large, containing tens of thousands of lines of statements, and are very complicated. While it may be relatively easy to develop a small local policy module to enable a small number of accesses not permitted by one of the default policies, creating an extensive policy for a large application is challenging. Developing simple policy editing
tools is a priority for many in the development community; several are available, with more in development.

Hitachi Software originally developed seedit, which uses a Simplified Policy Description Language. It uses simple name-based configuration and reduces the number of permissions required to get an application working. It also has a graphical interface for generating policies and contains other command-line tools. Additional information can be found at seedit.sourceforge.net.

Another policy editor, called SLIDE, has been developed by Tresys Technology. This editor is based on Eclipse, an open source Java development environment that must be installed before the editor itself. SLIDE provides a graphical user interface for policy development, project creation wizards, and a variety of other policy development aids. It is integrated with the Reference Policy and makes it easy to compile and build module packages. Additional information can be found at oss.tresys.com/projects/slide.

12.11 SELinux API

SELinux does provide a set of library calls to access attributes of processes and files and to perform several other operations, but most applications will not need them. SELinux is designed to minimize the need for unique system calls so that special versions of application code are not necessary; most activity should involve policy development instead. However, a set of APIs is available in the libselinux.so library to get and set security attributes on files and processes, and to test whether access from a specific domain to a type would be permitted. Most of the time, these library calls are needed only by control or daemon processes that must create a specific security context in which another process will run.

12.12 New Features in SELinux

Note: These new features were introduced in Red Hat Enterprise Linux 5.

SELinux has been significantly enhanced to provide out-of-the-box security in the Red Hat Enterprise Linux 5 release. Building on the capabilities of earlier Red Hat releases, RHEL 5 includes a number of enhancements to SELinux. The following is a brief overview of the new features.


Support for MLS (Multi Level Security)

Security-Enhanced Linux (SELinux) now provides support for Multi Level Security (MLS) policies. This enables Red Hat Enterprise Linux 5 to obtain US Government EAL4+/LSPP (Evaluation Assurance Level/Labeled Security Protection Profile) certification in addition to the existing EAL4+/CAPP certification. This means that Red Hat Enterprise Linux will offer the highest level of security clearance of any mainstream operating system.

Simplified Targeted Policy

All Red Hat Enterprise Linux 5 system services are provided with targeted policies, and these are enabled by default, resulting in the highest level of out-of-the-box security in the industry. Policy creation is simplified through the introduction of a reference policy and support for local policy modules, allowing ISVs and customers to create private policies.

Ease-of-use

Significant ease-of-use enhancements are provided with the inclusion of the SELinux Troubleshooter, a GUI-based analyzer that guides system administrators on appropriate actions to take in the event of a security alert.

Auditing

The Red Hat Enterprise Linux audit subsystem lets you track activities and modifications to the entire system, including file system operations, process system calls, user actions such as password changes, account additions/deletions/modifications, use of authentication services, and configuration changes (such as time changes). This allows Red Hat Enterprise Linux 5 to meet US Government certifications such as CAPP/LSPP and NISPOM, and also helps organizations meet regulatory requirements such as Sarbanes-Oxley and HIPAA. Audit now provides powerful searching and reporting tools and is closely integrated with SELinux. It is the only auditing system incorporated into the upstream community kernel. Audit also provides a feature that is unique in the industry: a real-time interface that permits applications to analyze and react to events as they occur. A future update to the Audit capability will provide multi-system log aggregation.


12.13 Additional Information

The following Web sites contain useful information about SELinux:

NSA - www.nsa.gov/selinux/
Fedora Project - fedora.redhat.com/About/Projects/selinux.html
SELinux Distribution Integration - selinux.sourceforge.net/
SELinux Symposium with past papers - selinux-symposium.org/
Planet SELinux with developers' blogs and more - selinuxnews.org/planet/
Red Hat Enterprise Linux 4 SELinux Documentation - www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/selinux-guide/
Gentoo SELinux - www.gentoo.org/proj/en/hardened/selinux/
Tresys Technology SELinux - www.tresys.com/selinux
Building SELinux Policy Modules - www.city-fan.org/tips/BuildSeLinuxPolicyModules
Red Hat resource center (see What's New) - http://www.redhat.com/rhel/resource_center

Additional information about SELinux is also available in: SELinux by Example: Using Security Enhanced Linux, by Frank Mayer, Karl MacMillan, and David Caplan; Prentice Hall.


13 Porting Trusted Solaris Applications to Security Enhanced Linux

This chapter covers issues involved in porting code from Trusted Solaris to Security Enhanced Linux (SELinux). Introductory material for this chapter can be found in the white paper "Legacy MLS/Trusted Systems and SELinux - concepts and comparisons to simplify migration and adoption" (http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA10827ENW&cc=us&lc=en). Background information on SELinux can be found in Chapter 12 of this porting guide. This chapter builds on the information in the white paper and Chapter 12 to discuss specific porting issues dealing with Trusted Solaris security features. Information in this chapter is based on work with Fedora Core 5, the first release containing SELinux with a Multilevel Security (MLS) policy. Readers of this chapter are assumed to have a good knowledge of Trusted Solaris security concepts and programming interfaces. More information on Trusted Solaris can be found in the Trusted Solaris Developer's Guide (Sun Microsystems Part No. 816-1042-10) (http://docs.sun.com/app/docs/doc/805-8116-10?a=load).

13.1 Background

Trusted Solaris is the result of many years of development effort to produce a highly secure environment. Sun Microsystems implemented many features from the Department of Defense Trusted Computer System Evaluation Criteria and from the Defense Intelligence Agency Compartmented Mode Workstation (CMW) specification.

SELinux is the result of a collaboration between the National Security Agency (NSA) and the open source community to produce a security environment that addresses many diverse computer security needs through mandatory access controls, but not to specifically address the requirements for which Trusted Solaris was created. Furthermore, SELinux is a work in progress. The information in this chapter is based on the state of SELinux found in Fedora Core 5. Most of the important security features of the Trusted Solaris operating system have been implemented in SELinux, but other security features, such as trusted networking and polyinstantiation, are in the process of being implemented. Still other features, such as a trusted window system,
are in the design and implementation phase. This chapter covers interfaces in SELinux that are well established and likely to be found in current and near-future Linux releases.

Unlike POSIX, there is no standard application programming interface (API) for accessing security features in these trusted systems. While Trusted Solaris and SELinux can both be certified to the Common Criteria Labeled Access Protection Profile, that does not mean they have a common way of implementing security features. There are no security-related library calls common to both systems. All Trusted Solaris API calls must be translated into SELinux library calls or policy development work.

13.2 Terminology

Trusted Solaris and SELinux have many of the same security features, especially in the area of sensitivity labels. However, there are some differences in terminology worth noting.

Trusted Solaris refers to the sensitivity label and clearance of a process as simply the process sensitivity label and process clearance. SELinux uses the terms effective sensitivity level and effective clearance for these process attributes. Trusted Solaris stores these security attributes in two different structures, but SELinux combines the effective sensitivity level and effective clearance into a single element of the security context called the range. The relationship between the process sensitivity level and process clearance is the same in Trusted Solaris and SELinux. The process sensitivity level is the basis for label access decisions: it is compared with the object sensitivity level to determine access. The process clearance is taken from the user account in Trusted Solaris, or from the SELinux user, and represents the highest level of information to which the user is authorized on the system. Generally, most operations are done below the clearance at the process sensitivity level, but programs may use privilege to access information up to the clearance.

The term sensitivity label is used in the same way in both Trusted Solaris and SELinux. However, the components of the sensitivity label have some terminology differences. The hierarchical component of the sensitivity label is referred to as the hierarchical classification or level in Trusted Solaris; SELinux typically refers to this component as simply a sensitivity. The non-hierarchical components of the sensitivity label are usually referred to as compartments in Trusted Solaris, while SELinux prefers the term category. There is complete consistency in the functionality but a slight difference in the
language. Terms that are shared between the two systems are subject and object. In discussing security operations, a subject is the initiator of an operation (a process). An object is any entity that is acted upon, such as a file, socket, pipe, directory, process, and so forth.

13.3 Mandatory Access Policy

The mandatory access control (MAC) policy in Trusted Solaris is based solely on sensitivity labels; SELinux MAC policy is based on type enforcement and sensitivity labels. This has implications for porting code: in Trusted Solaris a MAC denial is attributable to a violation of the label policy, while in SELinux a MAC denial could be due to either sensitivity label or type enforcement restrictions. SELinux performs a single MAC access check, but it is useful to think of it as a type enforcement check plus a sensitivity label access check.

The security policy in Trusted Solaris is fixed and cannot be changed; the Bell-LaPadula rules for MAC label access control and privilege usage are fixed. Trusted programs use privileges to perform actions that would normally be prohibited by the security policy. Process privileges are replacements for the all-powerful root account and provide a more discrete approach to policy overrides. Roles and authorizations are security mechanisms that allow users to override security policy by invoking trusted programs. These programs use the API to query the user's roles and authorizations and use privilege to accomplish a task if the user is authorized. This mechanism provides administrative control, enabling users to perform administrative and other trusted activities.

In contrast, SELinux has a centralized security policy that controls all accesses on the system. Policy can be tailored to allow certain processes to perform actions that are denied to other processes. However, instead of granting privilege directly to the process, as in Trusted Solaris, the SELinux policy is modified to permit operations to be performed from certain domains. Only processes running in these domains can perform privileged operations. Furthermore, type enforcement places additional limits on the process. There is no API for privilege manipulation in SELinux.
Policy is administratively defined to allow these operations external to the process. This means that some Trusted Solaris library calls have no counterpart in SELinux and will result in policy work instead of code porting effort.


In general, Trusted Solaris processes implement their own security policy through the use of privileges. In SELinux, policy is completely external to the process. This means that there will be fewer security specific API calls in programs in SELinux than in Trusted Solaris. However, Trusted Solaris program code development will be replaced by policy development in SELinux.

13.4 Sensitivity Labels

The basic label structure in Trusted Solaris is the CMW label, which contains the sensitivity label used for MAC enforcement and the information label, which floats according to the level of information added to the object. Most program activity is centered on the sensitivity label. Library routines are used to get and set the CMW label on objects and subjects, and other library routines provide the functionality to extract the sensitivity label from, or insert it into, the CMW label structure.

Note: Information labels are not supported in Trusted Solaris 7 and later releases. Trusted Solaris software interprets any information label on any object from systems running earlier releases as ADMIN_LOW. Subjects and objects still have CMW labels, and CMW labels still include the information label component; however, the information label component is fixed at ADMIN_LOW.

Trusted Solaris sensitivity labels can be represented in three different formats: text, binary, and hexadecimal. Text values for labels consist of a sensitivity level with zero or more compartments, such as "Secret Alpha Beta". Library routines exist to translate label representations from text values to binary structures and vice versa, but most label operations are done on binary label structures. Although the process clearance has externally the same format as a sensitivity label, Trusted Solaris differentiates between sensitivity labels and clearances as structures in the API.

SELinux supports only a string representation for sensitivity labels, and the sensitivity label is contained in the range, which is one attribute of the security context. The security context consists of four elements:

user (differs from the Linux UID)
role
type
range

All elements are strings separated by colons; the security context for an object or process is therefore a string of the form user:role:type:range. The range consists of one or two sensitivity labels separated by a dash (-). Each sensitivity label consists of the traditional sensitivity plus zero or more categories, in the format s1:c0,c2,c5. A range of categories may be specified using a dot, as in c0.c5, which means categories c0, c1, c2, c3, c4, and c5. The setrans.conf file can specify the translation of sensitivities and categories into more meaningful strings; for example, s1 could be translated to Confidential. Label string translation is done automatically, without API involvement, if a translation is specified in the setrans.conf file.

For SELinux processes, the range may be a single sensitivity label or a high and a low label. The high label corresponds to the Trusted Solaris clearance, and the low label corresponds to the effective sensitivity label of the process. Library routines exist to get and set the security context of subjects and objects. There are no binary or hexadecimal formats for the sensitivity label; there is simply a string consisting of sensitivities and categories. Because the range is part of the security context, library routines are provided to extract and insert the range part of the security context. Although string manipulation routines could be used for operations on the security context, the provided library routines should be used to extract and insert the range. Programs should be coded to use the provided APIs instead of relying on the present form of the security context; this helps avoid problems if the format changes in the future.

Trusted Solaris has a variety of library calls to determine sensitivity label dominance or equality.
Trusted Solaris also has routines to determine the sensitivity label containing the lowest classification and smallest set of compartments that dominates two given labels, or the highest sensitivity label dominated by two sensitivity labels. SELinux does not have library routines comparable to these and simply treats the label as one component of the security context; however, a combination of several library routines can provide similar functionality. Additionally, Trusted Solaris implements the Compartmented Mode Workstation Labeling: Encodings Format, DDS-2600-6216-93, which rigidly specifies label format and enforces rules on the composition and minimal requirements of valid sensitivity labels. SELinux does not implement this specification, but policy can exercise some control over which categories can exist with which sensitivities. If strict adherence to the CMW Encodings Format is required, the SELinux translation daemon can be replaced with a custom one that enforces this syntax.

13.5 Process Model

Trusted Solaris processes execute in the system environment with a set of security attributes that determine what actions the process may perform. In addition to the normal UNIX DAC attributes, a Trusted Solaris process also operates at a specific sensitivity level, has a clearance as previously described, and may also have privileges, which are essentially security policy overrides. Privileges may be either inherited from the parent process or assigned from the forced privileges on the program executable. In either case, the allowed set of privileges on the program executable file is an absolute limit on the privileges a process may use. A program is called a trusted program if it has the ability to use privilege. Since privilege allows policy overrides, trusted programs are actually implementing their own security policy. Trusted programs are expected to follow the principle of least privilege by using library calls to lower privileges when they are not needed and raise them for actions that require privilege.

SELinux processes execute with normal UNIX attributes and, like Trusted Solaris processes, operate at an effective sensitivity level and have an effective clearance. However, SELinux processes also execute in a specific domain, a concept that does not exist in Trusted Solaris. The domain is defined by the type in the process's security context. Policy statements can assign capabilities and type attributes to the domain that provide the functionality of Trusted Solaris privileges. In SELinux, the domain rather than the process has the capabilities and attributes. This means that the privileged activity possible from the capabilities and attributes is confined to operations defined for the domain rather than being system-wide in scope. Furthermore, security policy, that is, what is permitted or denied, is centralized in policy statements instead of being implemented in discrete processes. The end result of these differences is that replacing Trusted Solaris library routines that manipulate privileges requires policy work rather than programming effort when porting Trusted Solaris code to SELinux.


Trusted Solaris also uses process attribute flags to signal that certain processes have special characteristics, such as originating from the trusted path, performing privilege debugging, and so on. SELinux has no comparable feature.

13.6 Root Account Usage

Trusted Solaris and SELinux perform checks for privileged activity in different ways, and this has implications for programmers. Since SELinux is a Linux Security Module (LSM), access checks are done after traditional Linux discretionary access checks; if the traditional Linux DAC check fails, SELinux is not called. This means that in SELinux, many processes may need to run as the root user in order to pass Linux DAC checks, while Trusted Solaris processes simply need to raise privileges.

For example, suppose a process wants to bind to a privileged port (<1024). In Trusted Solaris, the process would simply need to raise the PRIV_NET_PRIVADDR privilege to accomplish this task; the process UID is not important. However, in SELinux, the process would have to be running as the root user in order to pass the Linux check that enforces that only root user processes may bind to privileged ports. It would also need to have the CAP_NET_BIND_SERVICE capability assigned to its domain, because SELinux does not base any access decisions on the UID: DAC checks are separate from MAC checks. Additionally, the domain would need type enforcement rules to permit basic socket operations. If the process were not running as root, the Linux check would fail and the SELinux LSM would not be called for the access check.

In the Trusted Solaris kernel, root user checks have been replaced by privilege checks. In SELinux, root user checks are done in the traditional kernel and mandatory access checks are done in the SELinux LSM. SELinux does not replace root checks; it only further restricts root privilege with additional mandatory access checks. Trusted Solaris has a single integrated access check for both DAC and MAC; Linux with SELinux implements this as two separate checks. An alternative to SELinux processes running as root is to use the Linux API to manipulate capabilities(7). Capabilities are similar to Trusted Solaris privileges but only for DAC policy overrides.
Starting with kernel 2.2, Linux divides the privileges traditionally associated with the superuser into distinct units, known as capabilities, which can be independently enabled and disabled. As of Linux kernel 2.6.11, there are 31 different capabilities defined. The API for capabilities is implemented, but capabilities cannot be assigned to the program executable to provide an initial set. However, one way to use capabilities is to install the program file set-user-ID to root so that it runs with an effective UID of zero. Traditionally this was used to enable a program to override all DAC security checks. However, it also initializes the process capabilities with a single capability, CAP_SETPCAP. This capability allows the process to set any capability in the process permitted or effective capability sets. In this way, a set of capabilities can be initialized for the process that allows it to apply a least-privilege approach to Linux DAC access overrides. When the UID is changed from zero, all capabilities are usually cleared; however, you can use the prctl(2) call to change this default behavior so that capabilities are retained when the UID is changed.

Thus, Linux DAC security policy can be addressed either by running as EUID=0 or by using capabilities. In this chapter, whenever reference is made to a process needing to run as root or as EUID=0, an alternative is to write your code using capabilities. Note that using the library routines to set capabilities on the process only pertains to traditional Linux DAC checks; capabilities must also be assigned to the process domain to pass SELinux MAC checks if required.

13.7 Role-Based Access Control (RBAC)

Trusted Solaris uses a combination of security features to implement Role-Based Access Control (RBAC). Authorizations are assigned to user accounts, indicating that the user is entitled to perform actions not permitted to normal users. Trusted programs check authorizations and use their privileges to accomplish the sanctioned tasks. Roles and rights profiles are expansions of this model. However, it is entirely up to the program itself to implement this role model; there is no overriding policy external to the program to enforce it. Trusted Solaris API library calls give the program the ability to query the databases containing this information, but it is totally up to the program code to implement the policy.

SELinux implements RBAC in a different manner. Roles are defined in policy statements, and domains are assigned to roles in the security policy statements. The SELinux user is assigned roles appropriate to their responsibilities on the system. Policy, not the program, enforces which domains are permitted to the user. Again, this model results in policy work instead of programming effort to implement role-based access control. Therefore, permitted user activities are ultimately determined by the policy, not by a process.

13.8 Auditing

Both Trusted Solaris and Linux provide an audit facility, and each provides a set of library calls to write security-relevant information to the audit trail. However, the audit interfaces are different. Trusted Solaris provides the ability to write complete audit records, or partial audit records with the write terminated by a separate call containing the AW_END token. A single auditwrite() call is provided that uses tokens to identify the information passed to the routine. Privilege must be raised to write audit records, and audit records are written to the audit trail in binary. The Linux audit API consists of library calls to log various message types to the audit trail. Each call writes a complete audit record; there are no partial audit record writes. The process must be running as the root account to write to the audit trail and must be running in a domain having the CAP_AUDIT_WRITE capability. Audit records are written in text format and can be read by any process having the proper access credentials.

13.9 Polyinstantiation

Polyinstantiation, the replication of objects at different sensitivity levels, is needed in multilevel security environments. At a minimum, public directories need to be polyinstantiated to give processes running at different levels access to them. Trusted Solaris uses pathname adornments, which are not shown to unprivileged processes, and it provides a set of library routines to determine and navigate directory hierarchies and files using their real pathnames. At the time of this writing, polyinstantiation is being implemented in SELinux, but no APIs are present in Fedora Core 5.

13.10 Trusted Networking

Trusted networking allows communication between systems while ensuring that the labeled protection enforced by the trusted operating system is extended to network communications. Unprivileged processes can communicate across the network, but only at their sensitivity level. Trusted networking imposes accreditation checks on a per-node and per-interface basis to ensure that packets leaving the system are not sent to unauthorized destinations. Communications between labeled systems pass at least the sensitivity label in the packet header for industry-standard protocols like CIPSO and RIPSO. Full security attributes (UID, GID, sensitivity label, privileges) are passed when using the Trusted System Information eXchange (TSIX) protocol. Trusted Solaris contains an API for trusted networking called the TSIX API; its library routines allow processes to set and retrieve the full security attributes of messages.

SELinux uses xfrm and IPsec to communicate with other SELinux systems, and a CIPSO implementation is under way to enable SELinux to communicate with legacy multilevel secure systems like Trusted Solaris. As of this writing, there are no user library routines to get or set the security attributes of packets other than the getpeercon() library call, which queries the security context of the remote process.

13.11 Trusted Windows

Trusted Solaris contained an implementation of trusted windows in which sensitivity labels were extended to window objects. Cut-and-paste operations were only permitted between windows of the same sensitivity level or, with privilege, between windows of different sensitivity levels. Library routines were provided to enable programs to query and set the sensitivity levels of window objects. SELinux does not yet extend the security context to window objects, and the X Window System does not work with a multilevel policy.

13.12 Porting Trusted Solaris Sensitivity Label Code to SELinux

One of the basic uses of the API in Trusted Solaris is to query and set sensitivity labels on subjects and objects. The basic structure containing the sensitivity label is the CMW label structure, which consists of the sensitivity label and the information label. Although the information label is no longer used, it is still present in the CMW label. A simple operation to display the sensitivity label of a file consists of three separate API calls:

    getcmwlabel() retrieves the CMW label from the file.
    getcsl() extracts the sensitivity label structure from the CMW label structure.
    bsltos() converts the sensitivity label structure to a string.

The procedure to set the CMW label on a file is the same process in reverse:

    stobsl() converts a string containing a sensitivity label to a sensitivity label structure.
    setcsl() inserts the sensitivity label structure into a CMW label structure.
    setcmwlabel() puts the CMW label onto the file.

You may need privileges for the above operations, since MAC attributes are usually set by policy, and changing or even converting label representations from text to binary involves operations not normally permitted to user processes. There are different procedures to get the sensitivity label of a link file (lgetcmwlabel()) and of an open file descriptor (fgetcmwlabel()). Use the same basic procedure to get or set the sensitivity label of a process, except that the call to get is getpcmwlabel() and the call to set is setpcmwlabel(). Since all objects and subjects are labeled with the CMW label, the same extraction and conversion routines are used. However, a process has both a CMW label and a clearance. The clearance is a single structure, externally the same as a sensitivity label but differentiated in the API. There is no need to extract or insert the clearance into another structure, so the getcsl() and setcsl() routines are not needed.

SELinux also has a set of routines to get and set the sensitivity label of a subject or object, but the sensitivity label is part of the security context of the subject or object. A comparable procedure to get the sensitivity label of a file is as follows:

    getfilecon() retrieves the context string of a file (user:role:type:range).
    context_new() initializes a new security context structure from the context string.
    context_range_get() retrieves the range (sensitivity label(s)) from the context structure.

Similar to the Trusted Solaris routines above, the reverse is done to put a new sensitivity label on a file:

    context_range_set() puts a range (sensitivity label(s)) into the context structure.
    context_str() converts the context structure back into a context string.
    setfilecon() sets the context string onto the file.

As with the previous Trusted Solaris process, the SELinux process may need to execute as the root user (EUID=0), depending on the DAC of the file, and it will need policy type attributes assigned to the domain in which the process executes. Also, SELinux has separate routines to get and set the security context of a link file (lgetfilecon()) and of an open file descriptor (fgetfilecon()). One difference is that the sensitivity label is actually a range consisting of a low and a high sensitivity label separated by a dash. On files the high label has no meaning, but for processes the high sensitivity label is the effective process clearance. Similar to Trusted Solaris, SELinux has a separate set of routines for getting and setting the security context of a process: the getcon() and setcon() routines get and set, respectively, the security context of a process. There are no routines for the process clearance, since the sensitivity label is really a range and the high sensitivity label is the process clearance.

13.13 Porting Trusted Solaris Privileges to SELinux

Trusted Solaris processes execute with several privilege sets that are initialized from the inheritable privileges of the parent process and the forced and allowed privileges on the file containing the program executable code. Whenever a system call is made that requires privilege, only the effective privileges are checked to see if the call will be permitted. Thus, least privilege is implemented in the API by raising privileges, by putting them in the effective privilege set when needed, and removing them from the effective set when no longer needed. The allowed privileges on the program executable file establish an absolute bound on the privileges that can be used by the process. Trusted Solaris trusted programs made extensive use of the API to manipulate these privileges in an effort to limit privileged operations; there are over 80 discrete privileges in Trusted Solaris to support this very precise use of privilege.

SELinux processes do not have privileges. The traditional Linux kernel has capabilities as a root privilege replacement, and these can be used in lieu of having a process execute as EUID=0. However, since access decisions are made independently by the traditional Linux kernel and SELinux, processes must independently pass each access check. Processes execute in a domain, and SELinux policy can assign capabilities to the domain in which a process executes, giving processes some of the discrete privileges of the root account. Capabilities only provide overrides to the DAC policy. For overrides to MAC rules, a different mechanism is used: SELinux policy contains a set of constraint rules that express general rules on how MAC operations are enforced. For example, the policy statements in Example 13-1 specify rules governing directory searches.

Example 13-1 Policy Constraints

mlsconstrain dir search
    (( l1 dom l2 ) or
     (( t1 == mlsfilereadtoclr ) and ( h1 dom l2 )) or
     ( t1 == mlsfileread ) or
     ( t2 == mlstrustedobject ));

This rule specifies that directory searches are constrained to the situation where the sensitivity label of the process (l1) dominates the sensitivity label (l2) of the directory. However, there are several or conditions specifying exceptions if certain type attributes are present on the domain of the process or the type of the directory. The first states that the search will be successful if the effective clearance (h1) of the process dominates the sensitivity label (l2) of the directory and the domain (t1) of the process has the mlsfilereadtoclr (MLS file read to clearance) attribute. The other or conditions apply if the domain of the process (t1) has the mlsfileread (MLS file read) attribute or if the type of the directory (t2) has the mlstrustedobject (MLS trusted object) attribute assigned to it. These constraint attributes are defined in the policy and used to express the basic MAC rules; attributes are assigned to domains and types in the policy definition to make use of the or conditions.

Thus, in Trusted Solaris, a process could read the contents of a directory whose sensitivity label strictly dominated the process sensitivity label by raising the PRIV_FILE_MAC_SEARCH privilege. In SELinux, a process could read the directory contents if the domain of the process had the mlsfileread type attribute assigned to it, or the mlsfilereadtoclr type attribute if the effective clearance of the process dominated the sensitivity level of the directory. The Trusted Solaris process requires privilege, and usually API code to manage the privilege, but the SELinux process requires policy work. No API code is needed.

13.14 Roles and Authorizations

The third major security feature in Trusted Solaris is the use of roles and authorizations. Unlike sensitivity labels and privileges, there is really no kernel code that enforces roles and authorizations. Roles are implemented in user-level and administrative programs that use the API to query user attributes (roles and authorizations) and use privileges to accomplish a task. The security policy governing the use of a role depends on the logic of the code, which acts in a particular way depending on the presence of the role or authorization. The use of roles and authorizations was extensively developed in Trusted Solaris because it avoided giving privilege directly to the user: program code controlled the use of privilege.

Roles are implemented differently in SELinux. The role is an important part of the security context, and it determines whether a process that executes in a particular domain is permitted to run. The enforcement of the role is not part of the program code but part of the SELinux enforcement policy. Domains are assigned to roles, and if the current role does not contain a particular domain, a process that runs in that domain may not be run by the user. No API is used.

13.15 Putting it all Together

Porting a program from Trusted Solaris to SELinux involves not only converting library and system calls to a new API but also work in an unfamiliar area: SELinux policy development. Several other aspects of the way in which SELinux works also require changes to the process environment. Tracing the porting of a program that performs a simple task from Trusted Solaris to SELinux illustrates these tasks.

The example chosen is the common task of reading the sensitivity label of a file, changing the sensitivity label, updating the label on the file, and then rereading the label. The process and the file initially have the same sensitivity label (unclassified), and the process will change the file's label to confidential, a level that dominates the label of the process. Since sensitivity labels are part of the mandatory policy, nonprivileged processes are not permitted to change labels. The file will have a different owner than the effective UID of the process, but the process will have read access to the file. Therefore, both the discretionary access control owner policy and the mandatory access control policy must be overridden. In Trusted Solaris, privileges enable the process to change labels; in SELinux, the process will need to execute with EUID=0 and in a domain that permits these actions.

13.15.1 Trusted Solaris Example Program

An example of a Trusted Solaris program to change the label of a file is shown in Example 13-2. The program begins by reading the CMW label of the file (getcmwlabel()), extracting the sensitivity label (getcsl()), converting the label to an external string representation (bsltos()), and printing it. To put a new label on the file, the reverse is done: a string representation of the new label is converted into an internal sensitivity label structure (stobsl()), inserted into the CMW label structure (setcsl()), and set on the file (setcmwlabel()). However, before the label can be set, privileges must be raised to enable the process to perform the tasks. The following privileges are needed by the program:

    PRIV_FILE_DAC_WRITE to be able to change the label of a file owned by some other user.
    PRIV_FILE_UPGRADE to be able to change a file label to a level that dominates the level of the process.
    PRIV_SYS_TRANS_LABEL to be able to convert the text representation of a label to its binary representation, since the process is running at a lower level than the new label.
    PRIV_FILE_MAC_READ to be able to reread the label of the file, since the new file label dominates the level of the process.

These privileges are added to the effective privilege set (set_effective_priv PRIV_ON) before the calls to create the new label and removed (set_effective_priv PRIV_OFF) after the new label is read. The program compile statement and program output are shown in Example 13-3.


Example 13-2 Trusted Solaris Program to change file sensitivity label

#include <tsol/label.h>
#include <tsol/priv.h>
#include <stdio.h>

main()
{
    int retval, err, length = 0;
    bclabel_t fileCMWlabel, fcl;
    bslabel_t fsenslabel, newsenslabel, fsl;
    char *string = (char *)0, *string1 = (char *)0;

    if (set_effective_priv(PRIV_SET, 0) == -1)
        perror("Cannot clear effective privileges");

    /* Get file CMW label */
    retval = getcmwlabel("/app/foobar", &fileCMWlabel);
    if (retval < 0)
        perror("getcmwlabel1");

    /* Get SL portion of CMW file label */
    getcsl(&fsenslabel, &fileCMWlabel);

    /* Translate file SL and print */
    retval = bsltos(&fsenslabel, &string, length, LONG_CLASSIFICATION);
    if (retval < 0)
        perror("bsltos1");
    printf("File Sensitivity label = %s\n", string);

    /* Set the SL to CONFIDENTIAL */
    /* Raise privileges */
    if (set_effective_priv(PRIV_ON, 4, PRIV_FILE_DAC_WRITE,
                           PRIV_FILE_MAC_READ,
                           PRIV_FILE_UPGRADE,
                           PRIV_SYS_TRANS_LABEL) == -1)
        perror("Cannot set effective privileges");
    retval = stobsl("CONFIDENTIAL", &newsenslabel, NEW_LABEL, &err);
    if (retval < 0)
        perror("stobsl");
    setcsl(&fileCMWlabel, &newsenslabel);
    retval = setcmwlabel("/app/foobar", &fileCMWlabel, SETCL_SL);
    if (retval < 0)
        perror("setcmwlabel1");

    /* Reread the file CMW label */
    retval = getcmwlabel("/app/foobar", &fcl);
    if (retval < 0)
        perror("getcmwlabel2");

    /* Get the SL portion */
    getcsl(&fsl, &fcl);

    /* Convert file SL and print */
    retval = bsltos(&fsl, &string1, length, LONG_CLASSIFICATION);
    if (retval < 0)
        perror("bsltos2");
    printf("NEW File Sensitivity label = %s\n", string1);

    /* Clear privileges */
    if (set_effective_priv(PRIV_OFF, 4, PRIV_FILE_DAC_WRITE,
                           PRIV_FILE_MAC_READ,
                           PRIV_FILE_UPGRADE,
                           PRIV_SYS_TRANS_LABEL) == -1)
        perror("Cannot clear effective privileges");
}


Example 13-3 Compile statement and program output

# cc setfilelabel.c -ltsol -o setfilelabel
# ./setfilelabel
File Sensitivity label = Unclassified
NEW File Sensitivity label = Confidential
#

SELinux Example Program

The code for the SELinux example program to perform the same task as the Trusted Solaris program is shown in Example 13-4. The code is simpler, since sensitivity labels are all string values, but policy work is required for the part of the Trusted Solaris program that uses privileges. Furthermore, since the program uses shared libraries and writes output to the terminal, additional type enforcement policy statements are needed. Additionally, the process will need to execute with EUID=0, because changing an attribute of an object that is not owned by the EUID of the process requires an override of the DAC policy. In SELinux, all DAC policy is checked before the SELinux security module is called, and if the DAC policy check fails, the operation fails; SELinux cannot grant privilege to a process that is denied by traditional DAC.

The program begins by reading the security context string of the file (getfilecon()). This context contains four security attributes separated by colon characters. A simple string manipulation call could be made to isolate the sensitivity label or range; however, in order to be independent of the underlying raw structure of the security context, library routines are used to create a new context structure (context_new()) and to extract the sensitivity label (context_range_get()) and print it. The reverse is done to set the label: the new label string is inserted into the context structure (context_range_set()), the structure is converted to a context string (context_str()), and the context is set on the file (setfilecon()). There are no privilege operations, since this is policy work and all policy work is done outside of the program. The program compile statement and program output are shown in Example 13-5.


Example 13-4 SELinux Program to change the range (sensitivity label) of a file

#include <stdio.h>
#include <selinux/selinux.h>
#include <selinux/context.h>

main()
{
    int retval;
    security_context_t secconstr, con;
    context_t seconstrct, sec1;

    /* Get file context */
    retval = getfilecon("/app/foobar", &secconstr);
    if (retval < 0)
        perror("getfilecon");

    /* Convert the security_context_t to a context_t */
    seconstrct = context_new(secconstr);

    /* Print the sensitivity label or range of the file */
    printf("File sensitivity label is %s\n", context_range_get(seconstrct));

    /* Assign new sensitivity label */
    retval = context_range_set(seconstrct, "Confidential");
    if (retval < 0)
        perror("context_range_set");
    secconstr = context_str(seconstrct);
    retval = setfilecon("/app/foobar", secconstr);
    if (retval < 0)
        perror("setfilecon");

    /* Reread and print the new label */
    retval = getfilecon("/app/foobar", &con);
    if (retval < 0)
        perror("getfilecon");
    sec1 = context_new(con);
    printf("NEW file sensitivity label is %s\n", context_range_get(sec1));
}

Example 13-5 Program compile statement and program output

# cc setfilelabel.c -lselinux -o setfilelabel
# ./setfilelabel
File sensitivity label is SystemLow
NEW file sensitivity label is Confidential

A local policy module will need to be created and loaded into the kernel. The policy for this program is shown in Example 13-6. The procedure for creating local policy modules is documented on the following web page: www.city-fan.org/tips/BuildSeLinuxPolicyModule. The following is a brief outline of what has to be done:

1) Create a new domain for the process, for example chglab_t.
2) Create a new type for the executable file, for example chglab_exec_t.
3) Create transition rules to transition to domain chglab_t when a file of type chglab_exec_t is executed.


4) Add type enforcement rules to allow the process access to the file.
5) Add type enforcement rules to allow access to shared libraries.
6) Add type enforcement rules to allow the process access to the tty device.
7) Assign the fowner Linux capability to the chglab_t domain.
8) Assign the following constraint attributes to the chglab_t domain:

    mlsfilereadtoclr to be able to reread the file label after it is changed.
    can_change_object_identity to be able to change a security attribute if the SELinux user differs from the SELinux user of the process.
    mlsfileupgrade to be able to change the label of a file to a value that dominates the label of the process.

Several other policy statements relating to type enforcement need to be added. Use the audit2allow(1) SELinux utility to list those elements.


Example 13-6 Local policy module to support program to change file sensitivity label

policy_module(localmisc, 0.1.5)

require {
        type user_t;
        type user_tty_device_t;
        type user_devpts_t;
        attribute mlsfilereadtoclr;
};

# Create new type and domain for program file and process
type chglab_t;
type chglab_exec_t;
domain_type(chglab_t)

# Assign the new domain to the user role
role user_r types chglab_t;

# Define transition to the new domain
domain_entry_file(chglab_t, chglab_exec_t)
domain_auto_trans(user_t, chglab_exec_t, chglab_t)

# Assign type attributes and capabilities to the domain
domain_obj_id_change_exemption(chglab_t)
typeattribute chglab_t mlsfilereadtoclr;
mls_file_upgrade(chglab_t)
allow chglab_t self:capability { fowner };

# Rules for terminal communication
allow chglab_t user_tty_device_t:chr_file { read write getattr ioctl };
allow chglab_t user_devpts_t:chr_file { getattr read write };
domain_use_interactive_fds(chglab_t)

# Need access to shared libraries
libs_use_ld_so(chglab_t)
libs_use_shared_libs(chglab_t)

# Type enforcement rules for file access
allow chglab_t user_t:file { read getattr relabelfrom relabelto };
allow chglab_t user_t:dir search;
allow chglab_t user_t:process sigchld;
fs_associate(user_t)

13.15.2 Comparing Trusted Solaris and SELinux Components

Although the code for accessing security attributes of files is different, each program must acquire privilege to be able to change a MAC attribute. Table 13-1 provides a comparison of the security features relevant in the two different programs.


Table 13-1 Example program security feature comparison

Process ID
    Trusted Solaris: Does not matter
    SELinux: EUID=0

Sensitivity Level
    Trusted Solaris: Initial level of the file
    SELinux: Initial level of the file

Clearance
    Trusted Solaris: Does not matter
    SELinux: New label of the file or higher

Ability to change a label
    Trusted Solaris: PRIV_FILE_UPGRADE privilege
    SELinux: mlsfileupgrade attribute assigned to the domain

Ability to read a label of a higher level file
    Trusted Solaris: PRIV_FILE_MAC_READ privilege
    SELinux: mlsfilereadtoclr attribute assigned to the domain

Ability to change attributes of a file owned by another user
    Trusted Solaris: PRIV_FILE_DAC_WRITE privilege
    SELinux: fowner capability assigned to the domain

Ability to translate a label not dominated by process SL
    Trusted Solaris: PRIV_SYS_TRANS_LABEL privilege
    SELinux: Not applicable

Ability to change an attribute of a file with a different SELinux user attribute
    Trusted Solaris: Not applicable
    SELinux: can_change_object_identity attribute assigned to the domain

13.15.3 Summary

This simple code illustrates several important concepts for porting Trusted Solaris code:

- There are some parallels in the API routines for manipulating sensitivity labels.
- Code that raises and lowers privileges will be removed from the program, but will require policy development work instead.
- Type enforcement policy work will probably be required for the new domains created for Trusted Solaris programs.


14 Virtualization

14.1 HP ICE-Linux

HP Insight Control Environment for Linux (ICE-Linux) provides comprehensive discovery, imaging, deployment, monitoring, and management for Linux-based HP ProLiant server platforms. Built on the industry-leading HP Systems Insight Manager (HP SIM), the solution integrates open source technology with experience leveraged from HP's Linux products and HP XC Clusters. It also provides a clear path to the future with developing technologies.

14.1.1 Why use HP Insight Control Environment for Linux?

- Integrated Linux management on industry-standard ProLiant servers
- Architected to manage expanding multi-server Linux ProLiant environments, to meet the need for increased productivity, utilization, and control
- Leverage of the best open source and HP management technologies, using the established HP Systems Insight Manager (SIM) for Linux platforms
- Use of established open source technologies (Nagios and others) for extensible systems health and performance monitoring
- Enablement of workgroup and departmental clustering
- Global support by HP

HP ICE-Linux brings the full expertise of HP's management investments from UNIX and Windows to the Linux environment, while providing the flexibility and productivity needed to handle a wide variety of routine management tasks.

14.1.2 Features and benefits

HP Insight Control Environment for Linux provides productivity, control, and a path to future innovations.

Comprehensive management for Linux productivity: HP ICE-Linux is designed to productively manage growing Linux environments on industry-standard servers.

- Couple bare-metal system discovery and Linux deployment with system image capture and deployment to get systems up and running quickly.
- Use Blade servers, Rack servers, or Virtual Connect with a variety of flexible network arrangements.
- Deploy from one to many servers without complex preparation.
- Choose HP-supported Linux distributions (RHEL, SLES), or choose your own Linux solution using HP-enabled capabilities.
- Manage clusters to empower your team, workgroup, or department.
- Auto-configuration of Nagios for quick user productivity, while preserving access to scripts and plug-ins.

For more information about HP Insight Control Environment for Linux (HP ICE-Linux) or for a free trial, contact your local HP representative or visit http://www.hp.com/go/ice-linux

ICE-Linux benefits: http://h18004.www1.hp.com/products/servers/management/insightcontrol_linux2/benefits.html

14.2 Virtualization on Red Hat Enterprise Linux 5

Red Hat Enterprise Linux 5 is the first product to deliver commercial-quality open source virtualization. Virtualization technology delivers a major step forward in IT operational flexibility, speed of deployment, and application performance and availability, allowing IT managers to deliver more to their customers while gaining control of their costs. With Red Hat virtualization, processing and data resources (servers and storage) are logically grouped into a single resource pool. Virtual servers and storage can then be dynamically allocated, in just a few seconds, as business demands dictate.

Red Hat Enterprise Linux provides support for two types of virtualized guests: para-virtualized and fully virtualized. Para-virtualized guests offer the highest performance and do not require special processor capabilities. Para-virtualization is available for Red Hat Enterprise Linux 4 (Update 5) and Red Hat Enterprise Linux 5 guests. Full virtualization is available for a wider range of guests, including Red Hat Enterprise Linux 3, 4, and 5, and also for third-party operating system guests. Full virtualization requires Intel VT or AMD-V processors.

- Virtualization is provided in all Server products and is optionally available for Client products.
- Storage and extended server virtualization are provided with Red Hat Enterprise Linux Advanced Platform.
- Red Hat Network supports virtualized guest operating systems.
- The virt-manager and libvirt/virsh management tools are included.
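The management tools named above can be exercised directly from the command line. A short sketch, assuming a configured libvirt host; the guest name rhel5-guest is a placeholder:

```shell
# List all defined guests, running or not.
virsh list --all

# Start a guest and attach to its console.
virsh start rhel5-guest
virsh console rhel5-guest

# Query the hypervisor's capabilities (shows which guest types,
# fully virtualized or para-virtualized, the host can run).
virsh capabilities
```

The same operations are available graphically through virt-manager.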

14.3 New Virtualization Features in Red Hat Enterprise Linux 5.1

Red Hat Enterprise Linux 5.1, released in November 2007, included a number of new virtualization features, including:

- Live migration of fully virtualized guests.
- Save/restore of fully virtualized guests.
- Para-virtualized device drivers for use with fully virtualized Red Hat Enterprise Linux 3, 4, and 5 guests. These provide significant performance gains.
- Full support of virtualization on Intel Itanium 2 systems.
- The ability to run 32-bit para-virtualized guests on x86-64 systems running in 64-bit mode.
- Improved graphical management tools.

One of the most compelling features that debuted in Red Hat Enterprise Linux 5 was fully integrated support for virtualization. A great deal of work went into the integration, all aimed at providing customers with a simple and consistent configuration and operation experience. Virtualization remains a substantial emphasis for Red Hat's development team, continuing the improvements in Red Hat Enterprise Linux 5.3. Since the initial delivery of Red Hat Enterprise Linux 5 in March 2007, the growing scalability of x86-64-based hardware has provided the motivation to support increasingly large virtualization platforms. In addition to enhancements to the core hypervisor (hosting) layer, Red Hat Enterprise Linux has also received improved guest capabilities. These enhancements benefit customers who wish to deploy a few large guest instances, as well as those who wish to deploy numerous smaller guests. Both deployment styles are valid: customers increasingly use virtualization to lower total cost of ownership (TCO) by increasing system management flexibility, for example by migrating workloads based on growth needs, and for high availability, such as guest instance failover and migration for planned maintenance. Examples of the scalability enhancements in Red Hat Enterprise Linux 5.3 include:

- Support for up to 126 physical CPUs, and 32 CPUs per virtual server
- Support for up to 1 TB of memory per server, and 80 GB per virtual server
- Support for more than 16 disks per guest
- Support for more than four network adapters per guest

Improved paravirtualization has been an important factor in driving the demand for increased scalability. In traditional fully virtualized environments, application workloads with high levels of network and disk I/O could incur up to 30 percent performance overhead compared to bare-metal deployments. Paravirtualization provides device drivers that cleanly plug into older Red Hat Enterprise Linux releases, which operate as virtualized guests in Red Hat Enterprise Linux 5. These paravirtualized device drivers are able to utilize enhanced hardware capabilities to bypass the majority of the virtualization overhead, resulting in minimal performance degradation. This allows I/O-intensive applications to become candidates for virtualization.
Red Hat Enterprise Linux 5.3 includes enhancements to the previously existing paravirtualization drivers as well as optimizations such as utilizing large 2MB page tables. These paravirtualization enhancements enable deployment of applications such as database and messaging workloads in virtualized environments, thereby driving the need for increasingly larger hardware configurations.


Another new virtualization feature in Red Hat Enterprise Linux 5.3 is libvirt-cim. Since the initial release of Red Hat Enterprise Linux 5, Red Hat has provided an abstraction layer called libvirt as the system management interface to virtualization. Libvirt is designed to hide the differences in low-level virtualization implementations, and their changing interfaces, from the system management tools. The libvirt library has proven popular and has been increasingly adopted by a variety of system management tool vendors, and a vibrant community has rallied around it. This benefits customers, who can choose the tools with which they are familiar and which best meet their use cases. Meanwhile, numerous commercial system management frameworks use an architecture called the Common Information Model (CIM) as the interface for interacting with managed services. Combining the features of both management standards, Red Hat Enterprise Linux 5.3 introduces libvirt-cim, widening the set of virtualization configuration and operational management capabilities to include CIM-compliant interfaces. For more information on Red Hat's virtualization strategy, see: http://www.redhat.com/virtualization-strategy/


15 Additional Resources

Additional information about Linux solutions at HP:

- Linux @ HP: www.hp.com/go/linux
- Eclipse: www.hp.com/go/eclipse

Tools and further information on migrating to Linux:

- HP DSPP Program: www.hp.com/dspp
- HP IT Resource Center: www.itrc.hp.com
- HP Caliper performance analyzer: www.hp.com/go/caliper
- The Red Hat Solaris to Linux porting guide: www.redhat.com/docs/wp/solaris port/book1.html
- Linux device driver development: www.oreilly.com/catalog/linuxdrive2
- Linux Standard Base: www.linuxbase.org
- The Linux Documentation Project: www.tldp.org
- UNIX Rosetta Stone (a UNIX/Linux command map): bhami.com/rosetta.html
- HP DSPP Linux resources: www.hp.com/go/linuxdev
- HP TCO calculator and other links: http://www.hp.com/wwsolutions/linux/resourcecentre.html
- HP & Open Source: opensource.hp.com
- HP Linux Technologies Project: hpl.hp.com/research/linux


A C Compiler Option Comparisons

The following table maps the Sun Studio 12 C compiler options to the corresponding GNU Compiler Collection (GCC) version 4.1.2 C compiler and Intel C 9.1 compiler options, where applicable. For more information about these compilers, visit the following Web sites:

GCC: http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/
Intel: download.intel.com/support/performancetools/c/linux/v9/copts_cls.pdf
Sun Studio 12 Collection: http://docs.sun.com/app/docs/coll/771.8?l=en

Sun C Option | GCC Option | Intel C Option | Description
-# | -v | -v | Shows the invocations of all components. Add -Wl,-v with GCC to see the linker invocation.
-### | -### | -dryrun | Similar to -#, but the stages are not actually executed.
-Aname[(token)] | -A name=token | -Aname[(token)] | Associates name as a predicate with the specified token, as if by the #assert preprocessing directive.
-Bdynamic | -Wl,-Bdynamic | -Bdynamic | Specifies that bindings of libraries for linking are dynamic (shared).
-Bstatic | -Wl,-Bstatic | -Bstatic | Specifies that bindings of libraries for linking are static (nonshared). Note: -static forces all libraries to be linked statically.
-C | -C | -C | Prevents the preprocessor from removing comments.
-c | -c | -c | Directs the compiler to compile, but not link.
-Dname[=val] | -Dname[=val] | -Dname[=val] | Defines a preprocessor macro.
-d[y|n] | Default behavior or -static | Default behavior or -static | Specifies dynamic or static linking. Dynamic is the default GCC and Intel behavior.


Sun C Option | GCC Option | Intel C Option | Description
-dalign | -malign-double | -malign-double or -Zp8 | Assumes at most 8-byte alignment.
-E | -E | -E | Runs only the preprocessor on source files.
-errfmt[=[no%]error] | No equivalent | No equivalent | Prefixes the string "error" to the beginning of error messages.
-errhdr | No equivalent | No equivalent | Limits warnings from header files.
-erroff[=t] | -w | -wdt | Disables printing specific warnings based on their tag number. No direct equivalent in GCC, but -w inhibits all warnings completely. Refer to the gcc(1) manpage for the types of warnings that can be suppressed.
-errshort[=short|full|tags] | No equivalent | -Wbrief | Controls the amount of detail in error messages. For Intel, -Wbrief prints a one-line diagnostic.
-errtags[=yes|no] | No equivalent | No equivalent | Shows error message tags.
-errwarn | -Werror | -Werror | Treats all warnings as errors.
-fast | No equivalent | -fast | Enables options to maximize execution speed. No direct equivalent exists for GCC; see -ffast-math, -fstrict-aliasing, and -march for partial equivalence.
-fd | -Wstrict-prototypes | No equivalent | Warns about K&R-style function declarations and definitions.
-flags | --help | -help | Prints a summary of the available compiler options. For GCC, a brief summary is listed for all available compilers.
-fma[=none|fused] | -mhard-float -mfused-madd | -fma | Enables generation of fused floating-point multiply-add instructions.


Sun C Option | GCC Option | Intel C Option | Description
-fnonstd | No equivalent | No equivalent | Macro for -fns and -ftrap=common. Causes nonstandard initialization of floating-point arithmetic hardware, and enables hardware traps for floating-point overflow, division-by-zero, and invalid-operation exceptions.
-fns | No equivalent | No equivalent | Turns on nonstandard floating-point mode.
-fprecision=p | No equivalent | -pc32 -pc64 -pc80 | Sets the floating-point rounding precision mode.
-fround=mode | No equivalent | -fp-model mode | Selects the rounding mode. The default IEEE mode is round-to-nearest. This is also the case on Linux, but GCC does not have an option to select a different mode; use the ISO C99 interface fesetround to select the appropriate rounding mode. For Intel, the -fp-model option controls rounding modes and compiler optimizations that affect FP results.
-fsimple=[n] | See -ffast-math | Default | Allows the optimizer to make simplifying assumptions about floating-point arithmetic.
-fsingle | No equivalent | No equivalent | The compiler evaluates float expressions using single precision instead of double precision.
-[no]fstore | -ffloat-store | -fp-model source | Forces conversion of floating-point expressions instead of leaving them in a register. For Intel, rounds intermediate results to source-defined precision.
-ftrap=mode | No equivalent | No equivalent | Turns on trapping for the specified floating-point conditions. On Linux, use the portable ISO C99 interfaces.


Sun C Option | GCC Option | Intel C Option | Description
-G | -shared | -shared -fpic | The linker creates a shared object instead of an executable. The Intel compiler requires -fpic on IA-32 and EM64T platforms.
-g | -g | -g | Generates debugging information.
-H | -H | -H | Prints the path name of each header file being compiled.
-h name | -Wl,--soname,name | No equivalent | Assigns a name to the generated shared object.
-I[dir] | -I[dir] -I- | -I[dir] -I- | Adds an include search directory. Any directories you specify with -I options before the -I- option are searched only for #include "file"; they are not searched for #include <file>.
-i | No equivalent | No equivalent | Passes an option to the linker to ignore the LD_LIBRARY_PATH setting.
-keeptmp | -save-temps | No equivalent | Retains temporary files.
-KPIC | -fPIC | -fPIC | Emits position-independent code.
-Kpic | -fpic | -fPIC | Emits position-independent code for use in shared libraries.
-Ldir | -Ldir | -Ldir | Adds dir to the library search path.
-lname | -lname | -lname | Links with library libname.so (or libname.a).
-mc | No equivalent | No equivalent | Removes duplicate strings from the .comment section.
-misalign | No equivalent | No equivalent | Assumes at most one-byte alignment.
-misalign2 | No equivalent | No equivalent | Assumes at most two-byte alignment.
-mr[,string] | No equivalent | No equivalent | Removes all strings from the .comment section and optionally inserts string.


Sun C Option | GCC Option | Intel C Option | Description
-mt | -pthread | -pthread | Passes -D_REENTRANT to the preprocessor and adds the threads library to the link line.
-native | See -b machine and -mcpu=cputype | -march=cputype | Directs the compiler to generate code for the current system architecture. No direct equivalent for GCC or Intel; however, the switches listed do allow you to specify a specific system architecture.
-O | -O | -O | Turns on the default optimization level (-xO2 for Solaris, -O1 for GCC, -O2 for Intel).
-o file | -o file | -o file | Specifies the output file.
-P | No equivalent | -P | The compiler preprocesses only the corresponding C files. No direct equivalent for GCC; use -E instead.
-p | -p | -p | Prepares object code to collect data for profiling with prof(1).
-Q[y|n] | No equivalent | No equivalent | Emits identification information to the output file.
-qp | -p | -p | Prepares object code to collect data for profiling with prof(1).
-Rdirlist | -Wl,-rpath dirlist | -Wl,-rpath dirlist | Passes a list of directories used to specify library search directories at run time.
-S | -S | -S | Directs the compiler to produce an assembly source file.
-s | -s | No equivalent | Removes all symbolic debugging information from the output file. For Intel, use the Linux strip utility on the executable.
-Uname | -Uname | -Uname | Undefines preprocessor symbol name.
-V | -v | -v | Prints information about the version of each tool as the compiler executes.


Sun C Option | GCC Option | Intel C Option | Description
-v | -Wall | -Wall or -strict-ansi | The compiler performs more semantic checks and enables lint-like tests. You can achieve this with GCC by using the -Wall option and other -W options that are not included in -Wall.
-Wc,arg | -Wc,arg | -Qoption,c,arg | Tells the compiler to pass arg as an argument to the tool named by c.
-w | -w | -w | Suppresses warnings.
-X[a|c|s|t] | See Table 4-1 in the Compilers chapter | See Table 4-1 in the Compilers chapter | Selects various degrees of compliance with ISO C.
-x386 | -mcpu=i386 or -march=i386 | No equivalent | Optimizes for the 386 processor.
-x486 | -mcpu=i486 or -march=i486 | No equivalent | Similar to -x386, but for the i486 processor.
-xa | -fprofile-arcs | -prof-genx | Inserts code for basic block coverage analysis using tcov. For GCC, use the GNU gcov(1) test tool to process the output. For Intel, use the codecov utility.
-xalias_level[=l] | -fstrict-aliasing, -fargument[no]alias, -fargument-noalias-global | -falias, -fno-alias, -ffnalias, -fno-fnalias, -alias-args, -ansi-alias | Specifies what assumptions can be made to perform optimizations using type-based alias analysis.
-xarch=name | -march=name | -march=name | Selects a specific instruction set to be used.
-xautopar | No equivalent | -parallel | Turns on automatic parallelization.
-xbuiltin | Default behavior; see -fno-builtin | Default | Improves optimization of code that calls standard library functions.
-xCC | Default | Default | Accepts C++-style comments.


Sun C Option | GCC Option | Intel C Option | Description
-xc99[=o] | -std={c99|c89} | -std={c99|c89} | Controls recognition of features from the C99 standard.
-xcache[=c] | No equivalent | Set by options such as -Xn and -Xb | Defines cache properties for the optimizer.
-xcg[89|92] | No equivalent | No equivalent | SPARC-specific optimization macros.
-xchar[=o] | -f[no]unsigned-char | -f[no]unsigned-char | Specifies whether type char is signed or unsigned.
-xchar_byte_order[=o] | No equivalent | No equivalent | Produces an integer constant by placing the characters of a multicharacter constant in the specified byte order.
-xcheck[=o] | -fstack-check | No equivalent | Turns on run-time checking for stack overflow. Intel has -fstack-security-check to detect buffer overruns.
-xchip[=name] | -march=cputype | -march=cputype, -axprocessor, or -tpn | Selects the target processor.
-xcode[=v] | No equivalent | No equivalent | Specifies the code address space.
-xcrossfile | No equivalent | -ipo | Enables optimization and inlining across source files.
-xcsi | No equivalent | No equivalent | Accepts source code modules that do not conform to ISO C source character code requirements.
-xdebugformat | No equivalent | No equivalent | Generates debugging information using the stabs or DWARF standard format.
-xdepend | Partial equivalence; see -floop-optimize | -O3 | Analyzes loops for inter-iteration data dependencies. No direct equivalent for GCC, but -floop-optimize performs loop optimizations.
-xdryrun | -### | -dryrun | Shows driver tool commands but does not execute the tools.
-xe | -fsyntax-only | -fsyntax-only | Performs syntax checks only.


Sun C Option | GCC Option | Intel C Option | Description
-xexplicitpar | No equivalent | No equivalent | Generates parallelized code based on specification of #pragma MP directives.
-xF | No equivalent | Use profile-guided optimization | Enables optimal reordering of functions and variables by the linker. For Intel, the proforder utility uses the profile-guided optimization data to create function-order optimizations.
-xhelp=flags | --help -v | -help | Displays a summary of compiler options.
-xhwcprof | No equivalent | No equivalent | Enables compiler support for hardware counter-based profiling.
-xild[on|off] | No equivalent | No equivalent | Enables/disables the incremental linker.
-xinline | -fno-inline | -fno-inline | Does not try to inline any functions.
-xinline=%auto | -finline-functions | -finline-functions | Attempts to inline all functions.
-xinline=fct,... | No equivalent | No equivalent | Inlines only the functions specified in the list.
-xinstrument[=[no]datarace] | No equivalent | No equivalent | Causes instrumentation of a multithreaded application for analysis.
-xipo[=a] | No equivalent | -ipo | Performs whole-program optimizations by invoking an interprocedural analysis component.
-xjobs=n | No equivalent | No equivalent | Sets the number of processes the compiler creates to complete its work.
-xldscope={v} | No equivalent | No equivalent | Changes the default linker scoping for the definition of extern symbols.
-xlibmieee | No equivalent | No equivalent | Math routines return IEEE 754-style return values for exceptional cases.
-xlibmil | No equivalent | No equivalent | Inlines some math library routines for faster execution.
-xlic_lib=sunperf | No equivalent | No equivalent | Links in the Sun-supplied performance libraries.


Sun C Option | GCC Option | Intel C Option | Description
-xlicinfo | --version | No equivalent | Returns information about the license file used.
-xloopinfo | No equivalent | -par-report | Shows which loops are parallelized.
-xM | -M | -M | Generates makefile dependencies.
-xMMD | -MMD | -MMD | Generates makefile dependencies excluding system header files.
-xMD | -MD | -MD | Generates makefile dependencies including compilation.
-xMF | -MF | -MF | Specifies a filename for the makefile dependency output.
-xM1 | -MM | -MM | Same as -xM, but excludes files in /usr/include. GCC excludes files from system header directories.
-xMerge | No equivalent | No equivalent | Merges data segments into text segments.
-xmaxopt[=v] | No equivalent | No equivalent | Limits the level of pragma opt to the level specified.
-xmemalign=ab | No equivalent | No equivalent | Specifies the maximum assumed memory alignment and the behavior of misaligned data.
-xnativeconnect | No equivalent | No equivalent | Allows shared libraries to interface with code written in Java.
-xnolib | -nostdlib | -nostdlib | Does not automatically link in any libraries.
-xlibmopt, -xnolibmopt | No equivalent | No equivalent | Enables/disables the use of optimized math routines.
-xnolibmil | Default behavior | -nolib-inline | Prevents inlining of math functions. This is the default for GCC and can be enforced with -fno-fast-math.
-xnorunpath | Default behavior | Default behavior | Prevents inclusion of a run-time search path for shared libraries.
-xO[n] | -O[n] | -O[n] | Selects the optimization level.
-xopenmp[=i] | No equivalent | -openmp | Enables explicit parallelization with OpenMP directives.


Sun C Option | GCC Option | Intel C Option | Description
-xP | -aux-info filename or protoize -n | See -Wmissing-prototypes | Prints prototypes for K&R function definitions. GCC prints prototypes for all functions; use the protoize tool with the -n option to see which prototypes should be added to a program. Intel provides -Wmissing-prototypes warnings.
-xpagesize[_heap|_stack]=n | Target-specific | No equivalent | Sets the page size in memory for the stack and/or heap.
-xparallel | No equivalent | -parallel | Automatically parallelizes loops, but also lets you specify what to do.
-xpch=v | Default | -pch | Activates the precompiled header feature. Precompiled header support is available for GCC 3.4 and above.
-xpchstop=file | No equivalent | No equivalent | Specifies the last include file for the precompiled header file created by -xpch.
-xpg | -pg | -pg | Prepares the code to collect data for profiling with gprof(1).
-xprefetch | No equivalent | -prefetch | Enables prefetch instructions on architectures that support them.
-xprefetch_auto_type | No equivalent | -prefetch | Generates indirect prefetches for loops.
-xprefetch_level=l | No equivalent | No equivalent | Controls the aggressiveness of automatic insertion of prefetch instructions.
-xprofile=collect | -fprofile-generate | -prof-gen | Instructs the compiler to generate code for collecting profiling information.
-xprofile=use | -fprofile-use | -prof-use | Uses the information from various runs of a program compiled with -xprofile=collect.
-xprofile=tcov | -fprofile-arcs | -prof-genx | The program emits information that can be examined using the tcov tool. For GCC, use gcov. For Intel, use codecov.


Sun C Option | GCC Option | Intel C Option | Description
-xprofile_ircache | No equivalent | No equivalent | Use with -xprofile=use to improve compile time.
-xprofile_pathmap | No equivalent | -prof-dir | Use with -xprofile=use to help the compiler find the profile data.
-xreduction | No equivalent | Enabled by -parallel | Turns on reduction recognition for automatic parallelization.
-xregs=r[,r...] | -ffixed-<reg>, -fcall-used-<reg>, or -fcall-saved-<reg> | No equivalent | Specifies the usage of registers for the generated code. With GCC you can use the -ffixed-<reg>, -fcall-used-<reg>, and -fcall-saved-<reg> options to specify the use of certain registers.
-xrestrict[=f] | -fargument-noalias or -fargument-noalias-global | -fargument-noalias or -restrict | Treats pointer-valued function parameters as restricted pointers. The GCC and Intel -fargument-noalias option is the same as -xrestrict=%all. Use -fargument-noalias-global to tell the compiler that parameters do not even alias global data. Use the ISO C99 keyword restrict to mark function parameters that do not alias other values. For Intel, use the -restrict option with the restrict keyword.
-xs | No equivalent | No equivalent | Copies all debug information into the executable.
-xsafe=mem | No equivalent | No equivalent | Allows the compiler to generate speculative loads.
-xsb | No equivalent | No equivalent | Generates additional symbol table information for the source code browser.
-xsbfast | No equivalent | No equivalent | Similar to -xsb, but does not create an object file.


Sun C Option | GCC Option | Intel C Option | Description
-xsfpconst | Partial equivalence; see -fshort-double | No equivalent | Unsuffixed floating-point variables are treated as float instead of the default double. GCC provides the -fshort-double option, which uses the same size for double as for float. This is not the same, but if double values are not used in the compilation unit, it is close enough.
-xspace | -Os | -O1 | Performs only optimizations that do not increase the code size.
-xstrconst | Default behavior | Default behavior | String literals are inserted into the read-only data section. This is the default with GCC. To get the default behavior of the Sun compiler, use the -fwritable-strings option (deprecated for GCC and Intel).
-xtarget=name | Target-specific (see the -m options) | -mcpu=cputype or -mtune=processor | Specifies the target platform for the instruction set and optimization.
-xtemp=dir | Use the TMPDIR environment variable | Use the TMPDIR environment variable | Sets the directory for temporary files. With GCC and Intel, set the environment variable TMPDIR to the name of the directory you want to use.
-xthreadvar[=o] | -ftls-model=mode | -ftls-model=mode | Controls the implementation of thread-local variables.
-xtime | -ftime-report | No equivalent | Reports the time and resources used for the compilation.
-xtransition | No equivalent | No equivalent | Issues warnings about differences between K&R and ISO C. GCC warns in general about ISO C rule violations; to not be warned, you must use the -traditional option to explicitly allow K&R behavior.
-xtrigraphs | -trigraphs | No equivalent | Determines whether the compiler recognizes trigraph sequences as defined by the ISO C standard.
-xunroll=n | --param max-unroll-times=n | -unroll[n] | Instructs the compiler to unroll loops n times.


Sun C Option | GCC Option | Intel C Option | Description
-xustr | No equivalent | No equivalent | Converts string literals to UTF-16 strings.
-xvector | No equivalent | Various options | Enables automatic generation of calls to the vector library functions. For Intel, various options enable this depending on the processor type selected for code generation (for example, -xW or -xT).
-xvis | No equivalent | No equivalent | For use with the VIS instruction set (SPARC V9 instruction set extension).
-xvpara | No equivalent | -par-report | Warns about loops that have #pragma MP directives when the loop may not be properly specified for parallelization.
-Yc,dir | -Bdir | -Qlocation,c,dir | Specifies that the component c of the compiler can be found in directory dir. For GCC, the -B option specifies the directory but not the component.
-YA,dir | -Bdir | -Bdir | Changes the default directory searched for compiler components.
-YI,dir | See -Bdir | See -Bdir | Changes the default directory searched for include files. For GCC and Intel, the -Bdir option may be used to search a directory before searching the standard directories.
-YP,dir | See -Bdir | See -Bdir | Changes the default directory for finding library files. For GCC and Intel, the -Bdir option may be used to search a directory before searching the standard directories.
-YS,dir | No equivalent; see -Bdir | No equivalent; see -Bdir | Changes the default directory for startup object files. For GCC and Intel, the -Bdir option may be used to search a directory before searching the standard directories.
-Zll | No equivalent | No equivalent | Creates a database for the Solaris lock_lint tool.


B C++ Compiler Option Comparisons

The following table maps the Sun Studio 12 C++ compiler options to the corresponding GNU Compiler Collection (GCC) version 4.1.2 G++ and Intel C++ 9.1 compiler options, where applicable. For more information about these compilers, visit the following Web sites:

GCC: http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/
Intel: download.intel.com/support/performancetools/c/linux/v9/copts_cls.pdf
Sun Studio 12 Collection: http://docs.sun.com/app/docs/coll/771.8?l=en

Sun C++ Option | G++ Option | Intel C++ Option | Description
-386 | -mcpu=i386 | No equivalent | Sets the target processor to i386.
-486 | -mcpu=i486 | No equivalent | Sets the target processor to i486.
-a | -fprofile-arcs | -prof-genx | Generates code for profiling.
-Bbinding | -static, -shared, or -symbolic | -Bstatic or -Bdynamic | Specifies whether a library binding for linking is symbolic, dynamic (shared), or static (nonshared).
-c | -c | -c | Directs the compiler to compile, but not link.
-cg{89|92} | No equivalent | No equivalent | Sets the target environment.
-compat[={4|5}] | -fabi-version | -fabi-version | Sets the major release compatibility mode of the compiler.
+d | -fno-inline | -fno-inline | Prevents the compiler from expanding C++ inline functions.
-Dname[=def] | -Dname[=def] | -Dname[=def] | Defines a macro symbol name.
-d[y|n] | Default behavior or -static | Default behavior or -static | Specifies dynamic or static linking. Dynamic is the default G++ and Intel behavior.

169

-dalign | -malign-double | -malign-double or -Zp | Assumes at most 8-byte alignment. G++ default alignments are not equivalent to SPARC alignments.
-dryrun | -### | -dryrun | Directs the compiler driver to show its compiler subcommands but not execute them.
-E | -E | -E | Directs the compiler driver to preprocess only the C++ source files and send the result to standard output (stdout).
+e{0|1} | No equivalent | No equivalent | Controls virtual table generation in Solaris compatibility mode 4.
-erroff[=t[,t...]] | -w | -wdt | Disables printing specific warnings based on their tag number. No direct equivalent in G++, but -w inhibits all warnings completely. Refer to the g++(1) documentation for the types of warnings that can be suppressed.
-errhdr | No equivalent | No equivalent | Limits warnings from header files.
-errtags[=yes|no] | No equivalent | No equivalent | Shows error message tags.
-errwarn | -Werror | -Werror | Treats all warnings as errors.
-fast | -O3 | -fast | Selects a combination of compilation options for optimum execution speed on the system on which the code is being compiled.
-features=a | No equivalent | No equivalent | Enables/disables various C++ language features. Refer to the "Options Controlling C++ Dialect" section in the GCC manual for more information.
-filt[=filter[,filter...]] | No equivalent | No equivalent | Suppresses the filtering that the compiler normally applies to linker error messages.
-flags | --help | -help | Displays a brief description of each compiler flag.

-fma[=none|fused] | -mhard-float -mfused-madd | -fma | Enables generation of fused floating-point multiply-add instructions.
-fnonstd | No equivalent | No equivalent | Causes hardware traps to be enabled for floating-point overflow, division by zero, and invalid operation exceptions.
-fns[={no|yes}] | No equivalent | No equivalent | Selects SPARC nonstandard floating-point mode. Refer to the -funsafe-math-optimizations option in the g++(1) manpage.
-fprecision=a | No equivalent | -pc32, -pc64, -pc80 | Sets floating-point rounding precision mode.
-fround=mode | See -frounding-math | -fp-model mode | Sets the IEEE rounding mode in effect at startup. For Intel, the -fp-model option controls rounding modes and compiler optimizations that affect FP results.
-fsimple[=n] | See -ffast-math | Default | Selects floating-point optimization preferences. Refer to the -ffast-math option in the gcc(1) manpage.
-[no]fstore | -ffloat-store | -fp-model source | Forces precision of floating-point expressions. The default G++ behavior is to disable this. Refer to the i386, x86-64, and IA-64 options in the gcc(1) manpage. For Intel, rounds intermediate results to source-defined precision.
-ftrap=a[,a...] | No equivalent | No equivalent | Sets the IEEE trapping mode in effect at startup.
-G | -shared | -shared -fpic | Instructs the linker to build a dynamic shared library instead of an executable file. Refer to the ld(1) manpage and the C++ User's Guide for more information. The Intel compiler requires -fpic on IA-32 and EM64T platforms.

-g | -g | -g | Instructs both the compiler and the linker to prepare the file or program for debugging.
-g0 | -glevel | No equivalent | Instructs the compiler to prepare the file or program for debugging, but not to disable inlining.
-H | -H | -H | On the standard error (stderr) output, prints the pathname of each #include file contained in the current compilation, one per line.
-h[ ]name | -Wl,--soname,name | No equivalent | Assigns the name to the generated shared dynamic library.
-help | -help | -help | Prints (on the standard output) a description of the command-line options.
-Ipathname | -Ipathname | -Ipathname | Adds pathname to the list of directories that are searched for #include files with relative file names (those that do not begin with a slash).
-I- | -I- | No equivalent | Any directories specified with -I options before -I- are searched only for the case of #include "file"; they are not searched for #include <file>.
-i | No equivalent | No equivalent | Tells the linker, ld(1), to ignore any LD_LIBRARY_PATH setting.
-inline | -fno-inline | -fno-inline-function | Does not try to inline any functions.
-instances=a | See -frepo | No equivalent | Controls the placement and linkage of template instances. For G++, the -frepo option provides a method to use a repository for templates.

-instlib=file | No equivalent | No equivalent | Use this option to inhibit the generation of template instances that are duplicated in a library, either static or shared, and the current object.
-keeptmp | -save-temps | No equivalent | Retains the temporary files that are created during compilation.
-KPIC | -fPIC | -fPIC | If supported for the target machine, emits position-independent code.
-Kpic | No equivalent | -fpic | Specifies code address space. Same as -xcode=pic13.
-Lpath | -Lpath | -Lpath | Adds a path to the library search paths.
-llib | -llib | -llib | Adds library lib.
-libmieee | No equivalent | No equivalent | Causes libm to return IEEE 754 values for math routines in exceptional cases.
-[no]libmil | No equivalent | No equivalent | Inlines selected math library routines for optimization.
-library=lib[,lib...] | No equivalent | No equivalent | Incorporates specified CC-provided libraries into compilation and linking.
-mc | No equivalent | No equivalent | Removes duplicate strings from the .comment section.
-migration | No equivalent | No equivalent | Provides information on migrating source code that was built for earlier versions of the compiler.
-misalign | No equivalent | No equivalent | Permits misaligned data, which would otherwise generate an error in memory.
-mr[,string] | No equivalent | No equivalent | Removes all strings from the .comment section.
-mt | -pthread | -pthread | Compiles and links multithreaded code. Passes -D_REENTRANT to the preprocessor.

-native | -march=cputype or -mcpu=cputype | -march=cputype | Directs the compiler to generate code for the current system architecture. No direct equivalent for GCC or Intel; however, the switches listed allow you to specify a specific system architecture.
-noex | No equivalent | No equivalent | Disables C++ exceptions. Refer to the "Options Controlling C++ Dialect" section in the GCC manual for more information.
-nolib | -nodefaultlibs or -nostdlib | -nodefaultlibs | Does not use the standard system libraries when linking.
-noqueue | No equivalent | No equivalent | Disables license queuing.
-norunpath | No equivalent | No equivalent | Does not build the path for shared libraries into the executable.
-O2 | -O | -O | Default optimization level. Note that the GCC default is -O0, while the default for Sun and Intel is -O2.
-O[level] | -O[level] | -O[level] | Specifies optimization level.
-o file | -o file | -o file | Specifies output file.
+p | No equivalent | No equivalent | Ignores nonstandard preprocessor asserts.
-P | No equivalent | -P | Preprocesses only the corresponding C files. No direct equivalent for G++; use -E instead.
-p | -p | -p | Prepares the object code to collect data for profiling with prof(1).
-pentium | -mcpu=pentium | -mcpu=pentium[2|3|4] | Specifies the target CPU.
-pg | -pg | -p | Generates extra code to write profile information suitable for the analysis program gprof(1).

-PIC | -fPIC | -fPIC | If supported for the target machine, emits position-independent code.
-pic | -fPIC | -fpic | If supported for the target machine, emits position-independent code.
-pta | No equivalent | No equivalent | On Solaris, same as -template=wholeclass.
-ptipath | No equivalent | No equivalent | Specifies an additional search directory for template source.
-pto | No equivalent | No equivalent | On Solaris, same as -instances=static.
-ptrpath | No equivalent | No equivalent | This option is obsolete and is ignored by the compiler.
-ptv | No equivalent | See -early-template-check | Displays each phase of instantiation as it occurs. For Intel, -early-template-check provides a method to check template function prototypes prior to instantiation.
-Qoption phase option | See -Wl,option | -Qoption,phase,option | Passes option to the specified compilation phase. No direct equivalent for G++, but -Wl passes options to the linker.
-qp | -p | -p | Generates profiling code.
-Qproduce sourcetype | -c or -S | -c, -S, or -P | Causes the compiler driver to produce output of the type specified by sourcetype. For G++ and Intel, use -c to produce an object file and -S to produce an assembler file. For Intel, use -P to produce a preprocessed file.
-Rpath[:path...] | -Wl,-rpath dirlist | -Wl,-rpath dirlist | Builds dynamic library search paths into the executable file.
-readme | No equivalent | No equivalent | Same as -xhelp=readme.

-S | -S | -S | Compiles and generates only assembly code.
-s | -s | No equivalent | Strips the symbol table from the executable file.
-sb | No equivalent | No equivalent | Produces information for the source code browser.
-sbfast | No equivalent | No equivalent | Produces only source browser information, no compilation.
-staticlib=l[,l...] | No equivalent | No equivalent | Indicates which C++ libraries specified by the -library, -xlang, and -xia options (including defaults) are to be linked statically.
-temp=path | Use the TMPDIR environment variable | Use the TMPDIR environment variable | Defines the directory for temporary files. With GCC and Intel, set the environment variable TMPDIR to the name of the directory you want to use.
-template=a[,a...] | No equivalent | No equivalent | Enables/disables various template options.
-time | -ftime-report | No equivalent | Causes the compiler driver to report execution times for the various compilation passes.
-Uname | -Uname | -Uname | Deletes the initial definition of the preprocessor symbol name.
-unroll=n | --param max-unroll-times=n | -unroll[n] | Enables loop unrolling.
-V | --version | -v | Displays the compiler version.
-v | -v | -v | Verbose.
-vdelx | No equivalent | See -early-template-check | This flag is available only for a compatibility mode on Solaris.
-verbose=a[,a...] | -v | No equivalent | Controls compiler verbosity.

+w | -Wall | -Wall | Identifies code that might have unintended consequences. Refer to the g++(1) manpage for additional warning options.
+w2 | -Wall | -Wall | Emits the same warnings as +w as well as warnings about technical violations that are probably harmless, but that might reduce the maximum portability of your program. Refer to the g++(1) manpage for additional warning options.
-w | -w | -w | Suppresses warning messages.
-xa | -fprofile-arcs | -prof-genx | Inserts code for basic block coverage analysis using tcov. For GCC, use the GNU gcov(1) test tool to process the output. For Intel, use codecov.
-xalias_level[=n] | -fstrict-aliasing, -fargument[-no]alias, -fargument-noalias-global | -falias, -fno-alias, -ffnalias, -fno-fnalias, -alias-args, -ansi-alias | Specifies what assumptions can be made to perform optimizations using type-based alias analysis.
-xar | No equivalent | No equivalent | Creates archive libraries (for templates).
-xarch=value | -march=name or -mcpu=cputype | No equivalent | Specifies the target architecture instruction set.
-xbuiltin[={%all|%none}] | See -fno-nonansi-builtins | See -f[no]builtin | Enables or disables better optimization of standard library calls.
-xcache=c | No equivalent | Set by flags such as -Xn and -Xb | (SPARC platforms) Defines the cache properties for use by the optimizer.
-xcg89 | No equivalent | No equivalent | Sets the target environment.

-xcg92 | No equivalent | No equivalent | Sets the target environment.
-xchar=o | -f[no]unsigned-char, -fsigned-char | -f[no]unsigned-char | The option is provided solely for the purpose of easing the migration of code from systems where the char type is defined as unsigned.
-xcheck[=n] | -fstack-check | -fstack-security-check, -fpstkchk | Enables a run-time check for stack overflow. For Intel, the -fpstkchk option enables FP stack checking after every function/procedure call.
-xchip=c | -march=cputype or -mcpu=cputype | -march=cputype or -axprocessor | Specifies the target processor for use by the optimizer.
-xcode=a | No equivalent | No equivalent | Specifies code address space.
-xcrossfile[=n] | No equivalent | -ipo | Enables optimization and inlining across source files.
-xdepend | Partial equivalence; see -floop-optimize | -O3 | Analyzes loops for interiteration data dependencies. No direct equivalent for GCC, but -floop-optimize performs loop optimizations.
-xdumpmacros[=val[,val...]] | No equivalent | No equivalent | Use this option when you want to see how macros are behaving in your program.
-xe | -fsyntax-only | -fsyntax-only | Checks only for syntax and semantic errors.
-xF | No equivalent | Use profile-guided optimization | The -xF option enables the optimal reordering of functions and variables by the linker.
-xhelp=flags | --help | -help | Displays a brief description of each compiler flag.
-xhelp=readme | No equivalent | No equivalent | Displays the contents of the online README file.
-xia | No equivalent | No equivalent | Links the appropriate interval arithmetic libraries and sets a suitable floating-point environment.
-xildoff | No equivalent | No equivalent | Turns off the incremental linker.

-xildon | No equivalent | No equivalent | Turns on the incremental linker.
-xinline | -fno-inline | -fno-inline | Does not try to inline any functions.
-xinline=%auto | -finline-functions | -finline-functions | Attempts to inline all functions.
-xinline[=func[,func...]] | No equivalent | No equivalent | Specifies which user-written routines can be inlined by the optimizer at -xO3 or higher.
-xipo[={0|1|2}] | No equivalent | -ipo | Performs interprocedural optimizations (IPO).
-xipo_archive | No equivalent | -ipo | Enables the compiler to perform IPO using object files compiled with -xipo that reside in an archive library (.a).
-xinstrument[=[no]datarace] | No equivalent | No equivalent | Causes instrumentation of a multithreaded application for analysis.
-xjobs=n | No equivalent | No equivalent | Compiles with multiple processors.
-xlang=lang[,lang] | No equivalent | No equivalent | Includes the appropriate Fortran run-time libraries and ensures the proper run-time environment for the specified language.
-xldscope={v} | No equivalent | No equivalent | Changes the default linker scoping for the definition of extern symbols.
-xlibmieee | No equivalent | No equivalent | Causes libm to return IEEE 754 values for math routines in exceptional cases.
-xlibmil | No equivalent | No equivalent | Inlines selected math library routines for optimization.
-x[no]libmopt | No equivalent | No equivalent | Uses a library of optimized math routines.
-xlic_lib=sunperf | No equivalent | No equivalent | Links in the Sun Performance Library.
-xlicinfo | --version | No equivalent | Shows license server information.
-xlinkopt[=level] | No equivalent | No equivalent | Performs link-time optimizations on relocatable object files.

-Xm | -fdollars-in-identifiers | Default | Accepts the $ (dollar sign) character in identifiers. This is the default behavior for both GCC and Intel.
-xM | -M | -M | Runs only the preprocessor on the named C++ programs, requesting that it generate makefile dependencies and send the result to the standard output (see make(1) for details about makefiles and dependencies).
-xMMD | -MMD | -MMD | Generates makefile dependencies, excluding system header files.
-xMD | -MD | -MD | Generates makefile dependencies, including compilation.
-xMF | -MF | -MF | Specifies a filename for makefile dependency output.
-xM1 | -MM | -MM | This option is the same as -xM, except that it does not report dependencies for the /usr/include header files or for compiler-supplied header files.
-xMerge | No equivalent | No equivalent | Merges the data segment with the text segment.
-xmemalign[=ab] | No equivalent | No equivalent | Specifies the maximum assumed memory alignment and the behavior of misaligned data accesses.
-xnativeconnect[=n] | No equivalent | No equivalent | Allows you to compile for the Native Connect Tool (NCT).
-xnolib | -nostdlib | -nostdlib | Disables linking with default system libraries.
-xlibmopt|-xnolibmopt | No equivalent | No equivalent | Enables/disables the use of optimized math routines.
-xnolibmil | Default behavior; see -fno-fast-math | -nolib-inline | Prevents inlining math functions. This is the default for GCC and can be enforced with -fno-fast-math.

-xnorunpath | Default behavior | Default behavior | Prevents inclusion of a runtime search path for shared libraries.
-xOn | -O[level] | -O[level] | Specifies optimization level (n). The G++ optimization levels are 0 through 3 and s.
-xopenmp[=i] | No equivalent | -openmp | Use the -xopenmp option to enable explicit parallelization with OpenMP directives.
-xpagesize[_heap|_stack]=n | Target-specific | Default | Sets the preferred page size for the stack and the heap.
-xpch=v | No equivalent | -pch | Activates the precompiled header feature. Precompiled header support is available for GCC 3.4 and above.
-xpchstop=file | No equivalent | No equivalent | Specifies the last include file for the precompiled header file created by -xpch.
-xpg | -pg | -pg | Compiles for profiling with the gprof(1) profiler.
-xport64[=v] | No equivalent | -Wp64 | Use this option to help you port code to a 64-bit environment.
-xprefetch[=a[,a]] | No equivalent | -prefetch | Enables and adjusts prefetch instructions on those architectures that support prefetch.
-xprefetch_auto_type | No equivalent | -prefetch | Generates indirect prefetches for loops.
-xprefetch_level=l | No equivalent | No equivalent | Controls the automatic insertion of prefetch instructions as determined with -xprefetch=auto.
-xprofile=collect | -fprofile-generate | -prof-gen | Instructs the compiler to generate code for collecting profiling information.
-xprofile=use | -fprofile-use | -prof-use | Uses the information from various runs of a program compiled with -xprofile=collect.

-xprofile=tcov | -fprofile-arcs | -prof-genx | The program will emit information that can then be examined using the tcov tool. For GCC, use gcov. For Intel, use codecov.
-xprofile_ircache[=path] | No equivalent | No equivalent | Use -xprofile_ircache[=path] with -xprofile=collect|use to improve compilation time during the use phase by reusing compilation data saved from the collect phase.
-xprofile_pathmap | No equivalent | -prof-dir | Use with -xprofile=use to help the compiler find the profile data.
-xregs=r[,r...] | -ffixed-<reg>, -fcall-used-<reg>, or -fcall-saved-<reg> | No equivalent | Specifies the usage of registers for the generated code. With G++, you can use the -ffixed-<reg>, -fcall-used-<reg>, and -fcall-saved-<reg> options to specify the use of certain registers.
-xs | No equivalent | No equivalent | Allows debugging by dbx(1) without object files.
-xsafe=mem | No equivalent | No equivalent | Allows the compiler to assume that no memory protection violations occur.
-xsb | No equivalent | No equivalent | Produces information for the source code browser.
-xsbfast | No equivalent | No equivalent | Produces only source browser information, no compilation.
-xspace | No equivalent | -O1 | Does not allow optimizations that increase code size.
-xtarget=name | Target-specific; see the -m options | -mcpu=cputype or -mtune=processor | Specifies the target platform for the instruction set and optimization. Refer to the g++(1) manpage.
-xthreadvar[=o] | -ftls-model=model | -ftls-model=model | Works in conjunction with the __thread declaration specifier to take advantage of the compiler's thread-local storage facility.

-xtime | -ftime-report | No equivalent | Causes the compiler driver to report execution times for the various compilation passes.
-xtrigraphs[={yes|no}] | -trigraphs | No equivalent | Enables or disables recognition of trigraph sequences as defined by the ISO/ANSI C standard. GCC disables trigraphs by default.
-xunroll=n | --param max-unroll-times=n | -unroll[n] | Enables unrolling of loops where possible.
-xustr={ascii_utf16_ushort|no} | No equivalent | No equivalent | Supports UTF-16 strings.
-xvector=[arg] | No equivalent | No equivalent | Enables generation of calls to vector library functions.
-xvis | No equivalent | No equivalent | Use the -xvis=[yes|no] option when you are using the assembly-language templates defined in the VIS instruction set Software Developers Kit (VSDK).
-xwe | -Werror | -Werror | Converts all warnings to errors by returning a nonzero exit status.
-z[ ]arg | -Wl,linker-argument | -Wl,linker-argument or -Xlinker option | Link editor option.

C Linker Option Comparisons

The following table compares the Solaris and GNU linker switches and provides mappings where possible. An exact match is not always possible. The match shown is the recommended mapping. See the appropriate linker manuals or manpages for details on all of the options.
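As with the compiler options, a few of the linker mappings are mechanical enough to script. The sketch below is only an illustration (the ld command line is hypothetical, and just three mappings from the table are handled): -G becomes -shared, -R path becomes -rpath path, and -z defs becomes --error-unresolved-symbols.

```shell
#!/bin/sh
# Minimal sketch: rewrite a Solaris ld command line using three
# mappings from the table below. The input link line is hypothetical.
sol_cmd='ld -G -h libfoo.so.1 -R /opt/app/lib -z defs foo.o'
gnu_cmd=$(printf '%s\n' "$sol_cmd" | sed \
    -e 's/ -G / -shared /' \
    -e 's/ -R / -rpath /' \
    -e 's/ -z defs / --error-unresolved-symbols /')
printf '%s\n' "$gnu_cmd"
# prints: ld -shared -h libfoo.so.1 -rpath /opt/app/lib --error-unresolved-symbols foo.o
```

Options with no GNU equivalent (for example, the audit options -p and -P) must still be handled by hand.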

Solaris Options | Linux Options | Description
-64 | No equivalent | Forces creation of a 64-bit image.
-a | -static | Produces a static executable.
-b | No equivalent | Produces a dynamic, non-position-independent code (non-PIC) executable.
-B direct | No equivalent | Performs direct binding.
-B dynamic|static | -B dynamic, -B static | Uses dynamic or static libraries.
-B eliminate | No equivalent | Removes any unversioned symbols.
-B group | -B group | Creates a group of shared objects.
-B local | No equivalent | Reduces unversioned global symbols to local.
-B reduce | No equivalent | Reduces the amount of versioned symbol data.
-B symbolic | -Bsymbolic | Performs link-time symbolic binding.
-c name | -c name | Uses the configuration file name.
-C | No equivalent | Demangles C++ symbol names in diagnostics.
-d[y|n] | -d[y|n] | Performs dynamic linking (y is the default).
-D token | No equivalent | Displays specialized debugging information.
-e epsym | -e epsym | Sets the image entry point to epsym.

-f name | -f name | Shared object used as a filter for name.
-F name | -F name | Shared object used as a filter for name.
-G | -shared | Produces a shared object (library).
-h name | -h name | Sets the internal library name as name.
-i | Default behavior | Ignores the LD_LIBRARY_PATH setting during the link.
-I name | -I name | Uses name as the interpreter for the executable.
-l libname | -l libname | Includes the library libname in the link.
-L path | -L path | Adds path to the library search path.
-m | -M | Displays a section map on stdout.
-M map | No equivalent | Uses the mapfile map.
-N string | --add-needed string | Adds an explicit dependency on string.
-o outfile | -o outfile | Produces an object file outfile.
-p auditlib | No equivalent | Audits the resulting image at run time using auditlib (inherited).
-P auditlib | No equivalent | Audits the resulting image at run time using auditlib (not inherited).
-Q[y|n] | No equivalent | Adds the linker version information to the .comment section.
-r | -r | Merges relocatable objects.
-R path | -rpath path | Uses path to locate dependencies at run time.
-s | -s | Strips symbolic information.
-S supportlib | No equivalent | Specifies a support library.

-t | No equivalent; see -z muldefs | Allows multiple symbols of different types with the same name.
-u symname | -u symname | Creates an undefined symbol symname.
-V | -V | Prints the version of ld.
-Y P,dirlist | -Y dirlist | Uses dirlist instead of the default locations for the library search.
-z absexec | No equivalent | Promotes absolute symbols from the library to the executable during the link.
-z allextract | --whole-archive | Extracts all elements from archive libraries.
-z combreloc | -z combreloc | Merges relocation sections.
-z defaultextract | --no-whole-archive | Uses default logic to extract elements from the archives.
-z defs | --error-unresolved-symbols | Makes unresolved symbols fatal at link time.
-z direct|nodirect | No equivalent | Enables/disables direct binding.
-z endfiltee | No equivalent | Marks a library as the endpoint of a filter.
-z finiarray=func | -fini func | Declares func as a fini routine.
-z groupperm|nogroupperm | No equivalent | Enables/disables group permissions.
-z ignore|record | No equivalent | Records or ignores dependencies not on the link line (the default is record).
-z initarray=func | -init | Declares func as an init routine.
-z initfirst | -z initfirst | Runs an object's init routines before any others.
-z interpose | -z interpose | Marks an object as an interposer for direct binding.

-z lazyload|nolazyload | No equivalent | Enables/disables lazy loading of dependencies.
-z loadfltr | -z loadfltr | Causes a filter to be applied immediately at load time.
-z muldefs | --allow-multiple-definition | Allows multiple definitions.
-z nodefs | --allow-shlib-undefined | Allows undefined symbols.
-z nodefaultlib | -z nodefaultlib | Uses only the object's run path when locating dependencies.
-z nodelete | -z nodelete | Prevents the library from being unloaded at run time.
-z nodlopen | -z nodlopen | Library cannot be opened with dlopen(3).
-z nodump | -z nodump | Library cannot be dumped with dldump(). dldump() is not available on Linux.
-z nopartial | No equivalent | Prevents partially initialized symbols.
-z noversion | No equivalent | Disables versioning.
-z now | -z now | Disables lazy binding of run-time symbols.
-z origin | -z origin | Requires immediate $ORIGIN processing at run time.
-z preinitarray=func | No equivalent | Declares func as a pre-init function.
-z redlocsym | -x | Eliminates all local symbols.
-z rescan | -( archives -) | Rescans archives until there are no further extractions.
-z text|textwarn|textoff | No equivalent | Changes the handling of relocations for nonwritable sections.
-z weakextract | No equivalent | Extracts referenced weak symbols from the archives.
-z verbose | --verbose | Verbose output.

D Make Suffix Rules

Suffix rules are built into make(1) and are used as templates to convert files of one suffix, such as foo.c, to files of another suffix, such as foo.o. This appendix lists the suffix rules using the Solaris Make Version 1.0 and GNU Make Version 3.79.1 commands. Note that one major difference between the Solaris and GNU Make suffix rule sets is that Solaris defines rules to support the tilde (~) SCCS backup file suffixes.
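For orientation, the fragment below shows the shape of a suffix rule next to the GNU make pattern rule that GNU users typically write instead; GNU make accepts both forms. The rule body is taken from the .c.o entry in the table, and recipe lines must begin with a tab character.

```make
# Suffix-rule form (understood by both Solaris make and GNU make).
.SUFFIXES: .c .o
.c.o:
	$(COMPILE.c) $(OUTPUT_OPTION) $<

# Equivalent GNU make pattern rule; this is the idiomatic GNU spelling
# and would normally be used instead of the suffix rule above.
%.o: %.c
	$(COMPILE.c) $(OUTPUT_OPTION) $<
```

In practice only one of the two forms would appear in a given makefile; they are shown together here for comparison.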

Suffix | Solaris Rule | GNU Rule
.c.a: | $(COMPILE.c) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.C.a: | $(COMPILE.C) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.c.ln: | $(LINT.c) $(OUTPUT_OPTION) -c $< | $(LINT.c) -C$* $<
.c.o: | $(COMPILE.c) $(OUTPUT_OPTION) $< | $(COMPILE.c) $(OUTPUT_OPTION) $<
.C.o: | $(COMPILE.C) $(OUTPUT_OPTION) $< | $(COMPILE.C) $(OUTPUT_OPTION) $<
.c: | $(LINK.c) -o $@ $< $(LDLIBS) | $(LINK.c) $^ $(LOADLIBES) $(LDLIBS) -o $@
.C: | $(LINK.C) -o $@ $< $(LDLIBS) | $(LINK.C) $^ $(LOADLIBES) $(LDLIBS) -o $@
.cc.a: | $(COMPILE.cc) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.cc.o: | $(COMPILE.cc) $(OUTPUT_OPTION) $< | $(COMPILE.cc) $(OUTPUT_OPTION) $<
.cc: | $(LINK.cc) -o $@ $< $(LDLIBS) | $(LINK.cc) $^ $(LOADLIBES) $(LDLIBS) -o $@
.cps.h: | $(CPS) $(CPSFLAGS) $*.cps | No rule
.def.sym: | $(COMPILE.def) -o $@ $< | $(COMPILE.def) -o $@ $<
.f.a: | $(COMPILE.f) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.F.a: | $(COMPILE.F) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.f.o: | $(COMPILE.f) $(OUTPUT_OPTION) $< | $(COMPILE.f) $(OUTPUT_OPTION) $<
.F.o: | $(COMPILE.F) $(OUTPUT_OPTION) $< | $(COMPILE.F) $(OUTPUT_OPTION) $<

.f90.a: | $(COMPILE.f90) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.f90.o: | $(COMPILE.f90) $(OUTPUT_OPTION) $< | No rule
.f90: | $(LINK.f90) -o $@ $< $(LDLIBS) | No rule
.f: | $(LINK.f) -o $@ $< $(LDLIBS) | $(LINK.f) $^ $(LOADLIBES) $(LDLIBS) -o $@
.F: | $(LINK.F) -o $@ $< $(LDLIBS) | $(LINK.F) $^ $(LOADLIBES) $(LDLIBS) -o $@
.ftn.a: | $(COMPILE.ftn) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.ftn.o: | $(COMPILE.ftn) $(OUTPUT_OPTION) $< | No rule
.ftn: | $(LINK.ftn) -o $@ $< $(LDLIBS) | No rule
.java.class: | javac $< | No rule
.l.c: | $(RM) $@; $(LEX.l) $< > $@ | @$(RM) $@; $(LEX.l) $< > $@
.l.ln: | $(RM) $*.c; $(LEX.l) $< > $*.c; $(LINT.c) -o $@ -i $*.c; $(RM) $*.c | @$(RM) $*.c; $(LEX.l) $< > $*.c; $(LINT.c) -i $*.c -o $@; $(RM) $*.c
.l.o: | $(RM) $*.c; $(LEX.l) $< > $*.c; $(COMPILE.c) -o $@ $*.c; $(RM) $*.c | No rule
.l: | $(RM) $*.c; $(LEX.l) $< > $*.c; $(LINK.c) -o $@ $*.c -ll $(LDLIBS); $(RM) $*.c | No rule
.mod.a: | $(COMPILE.mod) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.mod.o: | $(COMPILE.mod) -o $@ $< | $(COMPILE.mod) -o $@ $<
.mod: | $(COMPILE.mod) -o $@ -e $@ $< | $(COMPILE.mod) -o $@ -e $@ $^
.p.a: | $(COMPILE.p) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.p.o: | $(COMPILE.p) $(OUTPUT_OPTION) $< | $(COMPILE.p) $(OUTPUT_OPTION) $<
.p: | $(LINK.p) -o $@ $< $(LDLIBS) | $(LINK.p) $^ $(LOADLIBES) $(LDLIBS) -o $@
.r.a: | $(COMPILE.r) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.r.o: | $(COMPILE.r) $(OUTPUT_OPTION) $< | $(COMPILE.r) $(OUTPUT_OPTION) $<
.r: | $(LINK.r) -o $@ $< $(LDLIBS) | $(LINK.r) $^ $(LOADLIBES) $(LDLIBS) -o $@
.s.a: | $(COMPILE.s) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule

.S.a: | $(COMPILE.S) -o $% $<; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.s.o: | $(COMPILE.s) -o $@ $< | $(COMPILE.s) -o $@ $<
.S.o: | $(COMPILE.S) -o $@ $< | $(COMPILE.S) -o $@ $<
.sh: | $(RM) $@; cat $< > $@; chmod +x $@ | cat $< >$@; chmod a+x $@
.y.c: | $(YACC.y) $<; mv y.tab.c $@ | $(YACC.y) $<; mv -f y.tab.c $@
.y.ln: | $(YACC.y) $<; $(LINT.c) -o $@ -i y.tab.c; $(RM) y.tab.c | $(YACC.y) $<; $(LINT.c) -C$* y.tab.c; $(RM) y.tab.c
.y.o: | $(YACC.y) $<; $(COMPILE.c) -o $@ y.tab.c; $(RM) y.tab.c | No rule
.y: | $(YACC.y) $<; $(LINK.c) -o $@ y.tab.c $(LDLIBS); $(RM) y.tab.c | No rule
.cc~.a: | $(GET) $(GFLAGS) -p $< > $*.cc; $(COMPILE.cc) -o $% $*.cc; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.cc~.o: | $(GET) $(GFLAGS) -p $< > $*.cc; $(COMPILE.cc) $(OUTPUT_OPTION) $*.cc | No rule
.cc~: | $(GET) $(GFLAGS) -p $< > $*.cc; $(LINK.cc) -o $@ $*.cc $(LDLIBS) | No rule
.cps~.h: | $(GET) $(GFLAGS) -p $< > $*.cps; $(CPS) $(CPSFLAGS) $*.cps | No rule
.c~.a: | $(GET) $(GFLAGS) -p $< > $*.c; $(COMPILE.c) -o $% $*.c; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.C~.a: | $(GET) $(GFLAGS) -p $< > $*.C; $(COMPILE.C) -o $% $*.C; $(AR) $(ARFLAGS) $@ $%; $(RM) $% | No rule
.c~.ln: | $(GET) $(GFLAGS) -p $< > $*.c; $(LINT.c) $(OUTPUT_OPTION) -c $*.c | No rule
.c~.o: | $(GET) $(GFLAGS) -p $< > $*.c; $(CC) $(CFLAGS) -c $*.c | No rule
.C~.o: | $(GET) $(GFLAGS) -p $< > $*.C; $(COMPILE.C) $(OUTPUT_OPTION) $*.C | No rule
.c~: | $(GET) $(GFLAGS) -p $< > $*.c; $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $*.c | No rule
.C~: | $(GET) $(GFLAGS) -p $< > $*.C; $(LINK.C) -o $@ $*.C $(LDLIBS) | No rule
.def~.sym: | $(GET) $(GFLAGS) -p $< > $*.def; $(COMPILE.def) -o $@ $*.def | No rule

Suffix

.f90~.a:

Solaris Rule

$(GET) $(GFLAGS) -p $< > $*.f90 $(COMPILE.f90) -o $% $*.f90 $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.f90 $(COMPILE.f90) $(OUTPUT _OPTION) $*.f90 $(GET) $(GFLAGS) -p $< > $*.f90 $(LINK.f90) -o [email protected] $*.f90 $(LDLIBS $(GET) $(GFLAGS) -p $< > $*.ftn $(COMPILE.ftn) -o $% $*.ftn $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.ftn $ (COMPILE. ftn) $ (OUTPUT _OPTION) $* . ftn $ (GET) $ (GFLAGS) -p $< > $* . ftn $(LINK.ftn) -o [email protected] $* .ftn $(LDLIBS $(GET) $(GFLAGS) -p $< > $*.f $(COMPILE.f) -o $% $*.f $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.F $(COMPILE.F) -o $% $*.F $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.f $(FC) $(FFLAGS) -c $*. $(GET) $(GFLAGS) -p $< > $*.F $(FC) $(FFLAGS) -c $*.F $ (GET) $ (GFLAGS) -p $< > $*. f $(FC) $(FFLAGS) $(LDFLAGS) -o [email protected] $* . f $(GET) $(GFLAGS) -p $< > $*.F $(FC) $(FFLAGS) $(LDFLAGS) -o [email protected] $*. F

GNU Rule

No rule

.f90~.o:

No rule No rule No rule

.f90~: .ftn~.a:

.ftn~.o:

No rule No rule No rule

.ftn~: .f~.a:

.F~.a:

No rule

.f~.o: .F~.o: .f~:

No rule No rule No rule No rule No rule No rule No rule

.F~:

.java~.class: $(GET) $(GFLAGS) -p $< > $*.java javac $< .l~.c: $(GET) $(GFLAGS) -p $< > $*.l $(LEX) $(LFLAGS) $*.l mv lex.yy.c [email protected] .l~.ln: $(GET) $(GFLAGS) -p $< > $*.l $(RM) $*.c $(LEX.l) $*.l > $*.c $(LINT.c) -o [email protected] -i $*.c $(RM) $*.c .l~.o: $(GET) $(GFLAGS) -p $< > $*.l $(LEX) $(LFLAGS) $*.l $(CC) $(CFLAGS) -c lex.yy.c rm -f lex.yy.c mv lex.yy.c [email protected] $(GET) $(GFLAGS) -p $< > $*.l $(LEX) $(LFLAGS) $*.l $(CC) $(CFLAGS) -c lex.yy.c rm -f lex.yy.c mv lex.yy.c [email protected]

No rule

.l~:

No rule

191

Suffix

.mod~.a:

Solaris Rule

$(GET) $(GFLAGS) -p $< > $*.mod $(COMPILE.mod) -o $% $*.mod $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.mod $(COMPILE.mod) -o [email protected] $*.mo $ (GET) $ (GFLAGS) -p $< > $*.mod $(COMPILE.mod) -o [email protected] -e [email protected] $*.mo $(GET) $(GFLAGS) -p $< > $*.p $(COMPILE.p) -o $% $*.p $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.p $ (COMPILE.p) $ (OUTPUT _OPTION) $* .p $ (GET) $ (GFLAGS) -p $< > $*.p $(LINK.p) -o [email protected] $*.p $(LDLIBS $(GET) $(GFLAGS) -p $< > $*.r $(COMPILE.r) -o $% $*.r $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.r $(COMPILE.r) $(OUTPUT _OPTION) $* .r $(GET) $(GFLAGS) -p $< > $*.r $(LINK.r) -o [email protected] $*.r $(LDLIBS $ (GET) $ (GFLAGS) -p $< > $*.sh cp $*.sh [email protected] chmod a+x [email protected] $(GET) $(GFLAGS) -p $< > $*.s $(COMPILE.s) -o $% $*.s $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.SNo $(COMPILE.S) -o $% $*.S $(AR) $(ARFLAGS) [email protected] $% $(RM) $% $(GET) $(GFLAGS) -p $< > $*.s $(COMPILE.s) -o [email protected] $*.s $(GET) $(GFLAGS) -p $< > $*.S $(COMPILE.S) -o [email protected] $*.S $(GET) $(GFLAGS) -p $< > $*.y $(YACC) $(YFLAGS) $*.y mv y.tab.c [email protected] $(GET) $(GFLAGS) -p $< > $*.y $(YACC.y) $*.y $(LINT.c) -o [email protected] -i y.tab.c $(RM) y.tab.c $(GET) $(GFLAGS) -p $< > $*.y $(YACC) $(YFLAGS) $*.y $(CC) $(CFLAGS) -c y.tab.c rm -f y.tab.c mv y.tab.o [email protected] $(GET) $(GFLAGS) -p $< > $*.y $(YACC) $(YFLAGS) $*.y $(COMPILE.c) -o [email protected] y.tab.c $(RM) y.tab.c

GNU Rule

No rule

.mod~.o: .mod~: .p~.a:

No rule No rule No rule

.p~.o: .p~: .r~.a:

No rule No rule No rule

.r~.o: .r~: .sh~:

No rule No rule No rule No rule

.s~.a:

.S~.a:

rule

.s~.o: .S~.o: .y~.c:

No rule No rule No rule No rule

.y~.ln:

.y~.o:

No rule

.y~:

No rule

192
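Where GNU make lacks a Solaris suffix rule, an equivalent pattern rule can be added to the makefile. For example, GNU make reaches .y-to-.o by chaining its built-in %.c-from-%.y and %.o-from-%.c rules, which leaves an intermediate .c file; a build that relied on the single-step Solaris .y.o rule could add a sketch like the following (YACC.y and COMPILE.c are GNU make's own built-in variables; recipe lines must begin with a tab):

```make
# Minimal sketch: reproduce the Solaris .y.o suffix rule as a GNU make
# pattern rule. Generates the parser, compiles the object directly, and
# removes the intermediate y.tab.c, matching the Solaris behavior.
%.o: %.y
	$(YACC.y) $<
	$(COMPILE.c) -o $@ y.tab.c
	$(RM) y.tab.c
```

With this rule in place, building an object from a grammar file no longer depends on GNU make's rule chaining or leaves y.tab.c behind in the build directory.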

E Porting Checklist

When filled in, this checklist organizes the answers to many of the questions related to planning a porting effort.

What is the name of the application to be ported?

What is the primary reason for porting the application?

In one sentence, describe what the application does:

On what operating systems does the current application run? Give specific versions as reported by the software where you can (examples: Solaris 8, Solaris 9, HP-UX 11i v2).

To what operating system(s) will the application be ported?

Will the port be combined with other significant feature changes (example: a 32-bit to 64-bit code migration)? If so, provide a brief description:

List all languages and versions that are used to construct the application (examples: SUN C Version 5.5, HP ANSI C Version A.05.50, Perl 5.8, Java 1.4.1):

Language | Version | Comments


List all layered, proprietary, and third-party products the application depends on for proper operation (examples: Oracle9i database, OpenGL, Apache):

Product | Version | Comments

Do you have all of the source code files required to build the application? Yes / No

What is the quantity of source code? You can specify in terms of lines of code or megabytes of uncompressed source files.

What is the quantity of data associated with this application? Can you characterize its structure (examples: formatted, in a database, or unformatted binary)?

When was the last time the application environment was completely recompiled and rebuilt from source?

Is the application rebuilt regularly? How frequently?

Is the application actively maintained by developers who know it well?

How is a new build of the application tested or verified for proper operation?

Do you have performance characterization tools to assist with optimization? Please list.

Will these tests, validation suites, or performance tools need to be ported? Please list.

Which source-code and configuration management tools are used (examples: make, SCCS, RCS, CVS)?


Do you have a development and test environment separate from your production systems? Yes / No

What procedures are in place for rolling a new version of the application into production?

