Commit graph

4642 commits

Author SHA1 Message Date
Peter Wemm 29e9282e2e Cosmetic sync with i386 2006-03-13 23:55:31 +00:00
Paul Saab 12aff6461c Fix the format/display descriptor of vm.kmem_size and vm.kmem_free
to be 'long' instead of 'int' so that sysctl(8) correctly displays
the 8 returned bytes as a single 'long' instead of two 'int' values.

Submitted by:	peter
2006-03-13 08:13:37 +00:00
John Baldwin 8e8f0765ab Flip the switch and don't route interrupts to hyperthreads in a HT system.
In at least one benchmark this showed around a 20% performance increase.
If other workloads do benefit from having hyperthreads service interrupts,
we can always make this a loader tunable.

MFC after:	3 days
Tested by:	ps
2006-03-09 16:38:52 +00:00
Stephan Uphoff 68ff3c2445 Fix exec_map resource leaks.
Tested by: kris@
2006-03-08 20:21:54 +00:00
Yaroslav Tykhiy 4ffbe6ba9f MFi386 revision 1.1220: options TDFX_LINUX --> device tdfx_linux 2006-03-06 15:29:28 +00:00
Sam Leffler 5225f08dc9 guard function decls with _KERNEL so user code can include this file 2006-03-01 05:59:56 +00:00
John Baldwin 215e7c161a Rework how we wire up interrupt sources to CPUs:
- Throw out all of the logical APIC ID stuff.  The Intel docs are somewhat
  ambiguous, but it seems that the "flat" cluster model we are currently
  using is only supported on Pentium and P6 family CPUs.  The other
  "hierarchy" cluster model that is supported on all Intel CPUs with
  local APICs is severely underdocumented.  For example, it's not clear
  if the OS needs to glean the topology of the APIC hierarchy from
  somewhere (neither ACPI nor MP Table include it) and setup the logical
  clusters based on the physical hierarchy or not.  Not only that, but on
  certain Intel chipsets, even though there were 4 CPUs in a logical
  cluster, all the interrupts were only sent to one CPU anyway.
- We now bind interrupts to individual CPUs using physical addressing via
  the local APIC IDs.  This code has also moved out of the ioapic PIC
  driver and into the common interrupt source code so that it can be
  shared with MSI interrupt sources since MSI is addressed to APICs the
  same way that I/O APIC pins are.
- Interrupt source classes grow a new method pic_assign_cpu() to bind an
  interrupt source to a specific local APIC ID.
- The SMP code now tells the interrupt code which CPUs are avaiable to
  handle interrupts in a simpler and more intuitive manner.  For one thing,
  it means we could now choose to not route interrupts to HT cores if we
  wanted to (this code is currently in place in fact, but under an #if 0
  for now).
- For now we simply do static round-robin of IRQs to CPUs when the first
  interrupt handler just as before, with the change that IRQs are now
  bound to individual CPUs rather than groups of up to 4 CPUs.
- Because the IRQ to CPU mapping has now been moved up a layer, it would
  be easier to manage this mapping from higher levels.  For example, we
  could allow drivers to specify a CPU affinity map for their interrupts,
  or we could allow a userland tool to bind IRQs to specific CPUs.

The MFC is tentative, but I want to see if this fixes problems some folks
had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work
fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose
interrupts).

MFC after:	1 week
2006-02-28 22:24:55 +00:00
David Malone 0cbae93607 It seems bit 5 of cpu_feature2 is the VMX (Virtual Machine Extensions)
bit. While I'm here, delete a comment that was cut and past from the
cpu_features code that doesn't belong here.
2006-02-15 14:48:59 +00:00
Poul-Henning Kamp e8444a7e6f CPU time accounting speedup (step 2)
Keep accounting time (in per-cpu) cputicks and the statistics counts
in the thread and summarize into struct proc when at context switch.

Don't reach across CPUs in calcru().

Add code to calibrate the top speed of cpu_tickrate() for variable
cpu_tick hardware (like TSC on power managed machines).

Don't enforce monotonicity (at least for now) in calcru.  While the
calibrated cpu_tickrate ramps up it may not be true.

Use 27MHz counter on i386/Geode.

Use TSC on amd64 & i386 if present.

Use tick counter on sparc64
2006-02-11 09:33:07 +00:00
Poul-Henning Kamp eb2da9a51f Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
Poul-Henning Kamp 5b1a8eb397 Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of
"cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t
only when somebody wants to inspect the numbers.

For now "cputicks" are still derived from the current timecounter
and therefore things should by definition remain sensible also on
SMP machines.  (The main reason for this first milestone commit is
to verify that hypothesis.)

On slower machines, the avoided multiplications to normalize timestams
at every context switch, comes out as a 5-7% better score on the
unixbench/context1 microbenchmark.  On more modern hardware no change
in performance is seen.
2006-02-07 21:22:02 +00:00
John Baldwin 8917b8d28c - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
Wayne Salamon 4f9ac41fba Call the audit syscall enter/exit functions for the amd64 architecture,
both 32-bit and 64-bit paths. System calls will now be audited.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-04 20:37:20 +00:00
David Xu 6d7c1bdccd MFi386:
Clear carry flag in get_mconetxt so that setcontext does not
	return a bogus error.
2006-02-03 02:49:14 +00:00
Peter Wemm e2a5e4efdb Make PV entries dynamic on amd64. i386 has a pre-reserved block of kva
dedicated to storing pv entries, originally so that kva didn't have to be
allocated at inconvenient times.  For amd64, we can get the same effect by
using the direct map area.  Allocating pages is the same as with the object
backed method, but now we can just lookup the page in the direct map area.
Thus, no more pageable kva is reserved.  This is the single largest
consumer of kva on our work machines and this change should help conserve
the fixed size 2GB pageable kva on the amd64 kernel.

There are a pair of sysctl nodes introduced, named the same as their
tunable counterparts.  vm.pmap.shpgperproc and vm.pmap.pv_entry_max
They work just like the tunables of the same path, except the values are
linked.  The pv entry cap is now dynamically changeable.

I didn't make them totally unlimited because we need some sort of safety
limit still.  One could consume all physical memory without a cap.
2006-02-03 00:16:36 +00:00
John Baldwin 6966c33482 Call WITNESS_CHECK() in the page fault handler and immediately assume it
is a fatal fault if we are holding any non-sleepable locks.  This should
cut down on the number of bogus LORs we currently get when the kernel
panics due to a NULL (or bogus) pointer dereference that goes wandering
off into the VM system which tries to acquire locks and then kicks off
the spurious LORs.  This should probably be ported to all the archs at
some point.

Tested on:	i386
2006-01-27 22:22:10 +00:00
Scott Long 0af57729a6 Free the newtag if we exit with a failure from alloc_bounce_zone().
Found by: Coverity Prevent(tm)
2006-01-14 17:22:47 +00:00
David E. O'Brien f8ed1e340d Move linux support to the linux section. 2006-01-12 01:20:59 +00:00
Poul-Henning Kamp d3e64681d6 Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
	#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.
2006-01-10 09:19:10 +00:00
Warner Losh d5e61c97a6 By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into
param.h.  Per request, I've placed these just after the
_NO_NAMESPACE_POLLUTION ifndef.  I've not renamed anything yet, but
may since we don't need the __.

Submitted by: bde, jhb, scottl, many others.
2006-01-09 06:05:57 +00:00
John Baldwin 04dda605c5 - Make pcib_devclass private to sys/dev/pci/pci_pci.c and change all the
various pcib drivers to use their own private devclass_t variables for
  their modules.
- Use the DEFINE_CLASS_0() macro to declare drivers for the various pcib
  drivers while I'm here.
2006-01-06 19:22:19 +00:00
John Baldwin 360c3c2d1a Fix various places that were testing td_critnest to see if interrupts
should remain disabled during a trap or not to check
td_md.md_spinlock_count instead.
2006-01-06 18:02:12 +00:00
Jung-uk Kim dccb7faff6 - Explicitly validate an empty filter to match bpf_filter() comment[1].
- Do not use BPF JIT compiler for an empty filter.

[1] Pointed out by:	darrenr
2006-01-03 20:26:03 +00:00
Warner Losh 501755f4f6 Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for
each platform.  These will be used in the pci code in preference to
the complicated #ifdefs we have there now.
2006-01-01 20:59:28 +00:00
Alexander Leidinger e3d101c377 Unbreak kernel build.
A happy new year to all.

Submitted by:	Goran Gajic <ggajic@afrodita.rcub.bg.ac.yu>, bz
Pointy hat to:	netchild
Appologies to:	all
2006-01-01 05:35:57 +00:00
Alexander Leidinger ef39c05baa MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
   this allows to try different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contains
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values like it was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
Pawel Jakub Dawidek 70665fda32 Fix watch address truncation. The address was truncated when it was passed to
amd64_set_watch() as 'unsigned int' and 'unsigned int' is 32bit long on amd64.

Even with that fix hardware watchpoint don't work for me on amd64, ie. when
I set the watchpoint and write a byte there, nothing happens.
2005-12-27 23:23:47 +00:00
Maxim Sobolev 900b28f9f6 Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
Jeff Roberson 660002d398 - Improve the INKERNEL macro such that it can no longer give false positives.
This fixes the stack(9) functionality.

Submitted by:	Antoine Brodin <antoine.brodin@laposte.net>
2005-12-23 21:33:55 +00:00
John Baldwin b439e431bf Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that affect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototypeof  arm_handler_execute() so that it's first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
John Baldwin 5b2119223e Move the hostb driver out of the i386 and amd64 PCI code (where it was
duplicated anyways) and into a single MI driver.  Extend the driver a bit
to implement the bus and PCI kobj interfaces such that other drivers can
attach to it and transparently act as if their parent device is the PCI
bus (for the most part).
2005-12-20 21:09:45 +00:00
Marcel Moolenaar 757686b115 Make our ELF64 type definitions match standards. In particular this
means:
o  Remove Elf64_Quarter,
o  Redefine Elf64_Half to be 16-bit,
o  Redefine Elf64_Word to be 32-bit,
o  Add Elf64_Xword and Elf64_Sxword for 64-bit entities,
o  Use Elf_Size in MI code to abstract the difference between
   Elf32_Word and Elf64_Word.
o  Add Elf_Ssize as the signed counterpart of Elf_Size.

MFC after: 2 weeks
2005-12-18 04:52:37 +00:00
Scott Long 0717619c5c Don peril sensitive sunglasses and jack up the MAX_BPAGES limit to 8192
on amd64.  If you're going to stuff >4GB into your box, reserving 32MB for
bonce pages amounts to a rounding error in the overall scheme of things.
2005-12-16 05:57:18 +00:00
John Baldwin 410d857972 Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)
which existed to cleanup the linux_osname mutex.  Now that MTX_SYSINIT()
has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here
was redundant and resulted in panics in debug kernels.

MFC after:	1 week
Reported by:	Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu
2005-12-15 16:30:41 +00:00
John Baldwin 05ee80c796 Fix stale comment. 2005-12-14 21:47:02 +00:00
John Baldwin e83f6bcb75 Revert previous commit. The BIOS braindamage is even worse than I
originally thought.  The BIOS that cleared CPUID_APIC actually managed
to disable the local APIC entirely and even Windows 64 doesn't boot on
it.

Reported by:	bz
2005-12-13 18:29:10 +00:00
John Baldwin 15b7edbeaa Don't check the CPUID_APIC bit in the cpu_features flags field to determine
if the boot CPU has a local APIC because some BIOS vendors are not
competent enough to set this bit.  Instead, just assume that we always have
a local APIC on amd64.  For i386 the check is a bit more subtle.  FreeBSD
requires either an MP Table or an ACPI MADT table to enumerate APICs.  The
only systems that have one of those tables that don't have local APICs are
some presumably rare (and old) SMP 486 systems using external APICs.  Thus,
instead of checking the CPUID_APIC flag, check the CPU class and abort if
we are running on a 486.

MFC after:	1 week
Reported by:	bz
2005-12-13 15:09:40 +00:00
Peter Wemm 6bcdd71391 For the amd64 platform, we can depend on the TSC being present. This patch
changes DELAY to use the TSC once it has been calibrated.  This does NOT
use the TSC for long-term timekeeping.   It only uses it to bound the
DELAY() spinloop.  This should not be affected by the Athlon64 X2 TSC
quirks because the cpu is not halted while we use DELAY().
2005-12-12 22:27:07 +00:00
David Xu 992ee51fc0 Sync with i386, fix compiling for non-SMP. 2005-12-09 13:30:34 +00:00
John Baldwin 333b8de537 MFi386:
- Move PUSH_FRAME and POP_FRAME to asmacros.h and use PUSH_FRAME in
  atpic entry points.
- Move PCPU_* asm macros out of the middle of the asm profiling macros.
- Pass IRQ vector argument as an int rather than void * to reduce diffs
  with i386.
- EOI the lapic in C for the lapic timer handler.
- GC unused Xcpuast function.
- Split IPI_STOP handling code of ipi_nmi_handler() out into a
  cpustop_handler() function and call it from Xcpustop rather than
  duplicating all the logic in assembly.
- Fixup the list of symbols with interrupt frames in ddb traces.
  Xatpic_fastintr* have never existed on amd64, and the lapic timer
  handler and various IPI handlers were missing.
- Use trapframe instead of intrframe for interrupt entry points (on amd64
  the interrupt vector was already a separate argument, so the two frames
  were already identical) and GC intrframe.

Submitted by:	peter (3)
2005-12-08 18:33:30 +00:00
Peter Wemm 79880f7327 Catch up to the system siginfo changes. Use a union for the ia32 layout
of siginfo just like the system one.  There are now two fields to copy
instead of one.
2005-12-06 23:06:29 +00:00
John Baldwin 696effb697 - Cleanup whitespace and extra ()s in vtophys() macros.
- Move vtophys() macros next to vtopte() where vtopte() exists to match
  comments above vtopte().
- Remove references to the alternate address space in the comment above
  vtopte().  amd64 never had the alternate address space, and i386 lost it
  prior to PAE support being added.
- s/entires/entries/ in comments.

Reviewed by:	alc
2005-12-06 21:09:01 +00:00
Jung-uk Kim 50c9fad9ce Fix ZERO_EDX() macro from the previous commit. It was emitting
`xor %ecx, %ecx', not `xor %edx, %edx'.
2005-12-06 20:11:07 +00:00
Ruslan Ermilov 224d140293 Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE).  Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.
2005-12-06 13:27:21 +00:00
Jung-uk Kim 6a96c4832f s/M_WAITOK/M_NOWAIT/ while mutex is held.
Pointed out by:	csjp
2005-12-06 07:22:01 +00:00
Jung-uk Kim 23a8fc28c2 - Micro-optimize mov $0, %edx' -> xor %edx, %edx'.
- Correct amd64 macro style (no functional change).
2005-12-06 06:45:39 +00:00
Jung-uk Kim ae275efcae Add experimental BPF Just-In-Time compiler for amd64 and i386.
Use the following kernel configuration option to enable:

	options BPF_JITTER

If you want to use bpf_filter() instead (e. g., debugging), do:

	sysctl net.bpf.jitter.enable=0

to turn it off.

Currently BIOCSETWF and bpf_mtap2() are unsupported, and bpf_mtap() is
partially supported because 1) no need, 2) avoid expensive m_copydata(9).

Obtained from:	WinPcap 3.1 (for i386)
2005-12-06 02:58:12 +00:00
John Baldwin 5ae84c09e7 Really slam the door on mixed mode now that we don't depend on it for a
working IRQ0 with APIC anymore.  Previously, it was possible to have
some other ATPIC IRQS "leak" through in a few edge cases.  For example, on
my x86 test machine, ACPI re-routes the SCI (IRQ 9) to intpin 13 on the
first I/O APIC.  This leaves a hole for IRQ 13 (since the APIC doesn't
provide a source for IRQ 13 in that case) with the result that the ATPIC
IRQ13 source was registered instead.  This changes the 8259A drivers to
only register their interrupt sources if none of the 16 ISA IRQs have an
interrupt source already installed.

MFC after:	1 week
2005-12-05 22:09:30 +00:00
Eric Anholt 69b9fffc84 Merge DRM CVS as of 2005-12-02, adding i915 DRM support thanks to Alexey Popov,
and a new r300 PCI ID.
2005-12-03 01:23:50 +00:00
Eric Anholt 9fb0767374 Update DRM to CVS snapshot as of 2005-11-28. Notable changes:
- S3 Savage driver ported.
- Added support for ATI_fragment_shader registers for r200.
- Improved r300 support, needed for latest r300 DRI driver.
- (possibly) r300 PCIE support, needs X.Org server from CVS.
- Added support for PCI Matrox cards.
- Software fallbacks fixed for Rage 128, which used to render badly or hang.
- Some issues reported by WITNESS are fixed.
- i915 module Makefile added, as the driver may now be working, but is untested.
- Added scripts for copying and preprocessing DRM CVS for inclusion in the
  kernel.  Thanks to Daniel Stone for getting me started on that.
2005-11-28 23:13:57 +00:00
John Baldwin d6ef938e56 If we get a stray interrupt, return after logging it. In the extremely
rare case of a stray interrupt to an unregistered source (such as a stray
interrupt from the 8259As when using APIC), this could result in a page
fault when it tried to walk the list of interrupt handlers to execute
INTR_FAST handlers.  This bug was introduced with the intr_event changes,
so it's not present in 5.x or 6.x.

Submitted by:	Mark Tinguely tinguely at casselton dot net
2005-11-28 20:18:43 +00:00
Ruslan Ermilov 6646524f34 - Allow duplicate "machine" directives with the same arguments.
- Move existing "machine" directives to DEFAULTS.
2005-11-27 23:17:00 +00:00
Lukas Ertl ae5a74ec72 Fix typo. 2005-11-24 15:28:32 +00:00
Ruslan Ermilov 1a581012df Add missing "struct" in i386/i386/machdep.c,v 1.497 by deischen@. 2005-11-24 08:16:18 +00:00
John Baldwin 7417e80b4e Don't enable PUC_FASTINTR by default in the source. Instead, enable it
via the DEFAULTS kernel configs.  This allows folks to turn it that option
off in the kernel configs if desired without having to hack the source.
This is especially useful since PUC_FASTINTR hangs the kernel boot on my
ultra60 which has two uart(4) devices hung off of a puc(4) device.

I did not enable PUC_FASTINTR by default on powerpc since powerpc does not
currently allow sharing of INTR_FAST with non-INTR_FAST like the other
archs.
2005-11-21 20:22:35 +00:00
John Baldwin 16e0d8caf8 Expand the hack to mask the atpics if 'device atpic' is not in the kernel
during boot up.  Now we do a full reset of the 8259As and setup a simple
interrupt handler (we actually borrow the apic one that just does an
immediate iret) to handle any spurious interrupts triggered by either chip.
This should fix some folks that were getting a Trap 30 during bootup of
certain SMP AMD systems.  This might get pushed into the 6.0 branch as an
errata.  For now a suitable workaround is to add 'device atpic' to your
kernel config.

Tested by:	scottl
Helpful info from:	dillon
MFC after:	1 week
2005-11-21 18:39:17 +00:00
Alan Cox 97a0c226d6 Eliminate pmap_init2(). It's no longer used. 2005-11-20 06:09:49 +00:00
John Baldwin 7d0a7ec90c - Always print the trap number so that we have something to start with for
mystery traps.  If we don't have a message for a given trap, just use
  UNKNOWN for the message.
- Add trap messages for T_XMMFLT and T_RESERVED.

MFC after:	1 week
2005-11-18 19:26:46 +00:00
David E. O'Brien 5ab591d4d9 Fix spelling mistake.
Submitted by:	kris
2005-11-17 02:32:39 +00:00
John Baldwin db477d6cc8 Revert a part of the previous commits to these files that made the NMI
IPI_STOP handling code use atomic_readandclear() to execute the restart
function on the first CPU to resume and restore the behavior of always
executing the restart function on the BSP since this is in fact what the
non-NMI IPI_STOP handler does.  I did add back in a statement to clear
the restart function pointer after it is executed to match the behavior
of the non-NMI IPI_STOP handler.
2005-11-16 20:58:40 +00:00
John Baldwin fdb9ce3716 Revert previous commit to these files. There isn't a race necessitating
an xchg instruction as we only try to execute the startup function if
the CPU ID is 0 (i.e. the BSP).  I missed this earlier.
2005-11-16 20:55:57 +00:00
John Baldwin b60119eb02 Fix a typo in the check for an invalid APIC. If we are told about an
I/O APIC that doesn't exist, then a read of the version register is going
to return -1 which is 0xffffffff not 0xffffff.

Tested on:	i386
Tested by:	Nikos Ntarmos ntarmos at ceid dot upatras dot gr
MFC after:	1 week
2005-11-16 20:29:29 +00:00
Alan Cox 65336314cf In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock
cannot possibly occur.
2005-11-13 02:17:05 +00:00
Ruslan Ermilov 6d8200ff0c Add /dev/speaker support to amd64.
The following repo-copies were made (by Mark Murray):

sys/i386/isa/spkr.c -> sys/dev/speaker/spkr.c
sys/i386/include/speaker.h -> sys/dev/speaker/speaker.h
share/man/man4/man4.i386/spkr.4 -> share/man/man4/spkr.4
2005-11-11 09:57:32 +00:00
Alan Cox 7a35a21e7b Reimplement the reclamation of PV entries. Specifically, perform
reclamation synchronously from get_pv_entry() instead of
asynchronously as part of the page daemon.  Additionally, limit the
reclamation to inactive pages unless allocation from the PV entry zone
or reclamation from the inactive queue fails.  Previously, reclamation
destroyed mappings to both inactive and active pages.  get_pv_entry()
still, however, wakes up the page daemon when reclamation occurs.  The
reason being that the page daemon may move some pages from the active
queue to the inactive queue, making some new pages available to future
reclamations.

Print the "reclaiming PV entries" message at most once per minute, but
don't stop printing it after the fifth time.  This way, we do not give
the impression that the problem has gone away.

Reviewed by: tegge
2005-11-09 08:19:21 +00:00
Marcel Moolenaar 38195fdcaf Add uart(4). When both sio(4) and uart(4) can handle a serial port,
sio(4) will claim it. This change therefore only affects how ports
are handled when they are not claimed by sio(4), and in principle
will improve hardware support.

MFC after: 2 months
2005-11-05 19:48:53 +00:00
Peter Wemm 55cd3ef2e5 Define M_IOAPIC the same as i386 2005-11-04 23:02:28 +00:00
Ruslan Ermilov 160ebe8754 Catch up with the recent <sys/signal.h> change and make this compile. 2005-11-04 20:32:26 +00:00
Alan Cox e9cb1037da Begin and end the initialization of pvzone in pmap_init().
Previously, pvzone's initialization was split between pmap_init() and
pmap_init2().  This split initialization was the underlying cause of
some UMA panics during initialization.  Specifically, if the UMA boot
pages was exhausted before the pvzone was fully initialized, then UMA,
through no fault of its own, would use an inappropriate back-end
allocator leading to a panic.  (Previously, as a workaround, we have
increased the UMA boot pages.)  Fortunately, there is no longer any
reason that pvzone's initialization cannot be completed in
pmap_init().

Eliminate a check for whether pv_entry_high_water has been initialized
or not from get_pv_entry().  Since pvzone's initialization is
completed in pmap_init(), this check is no longer needed.

Use cnt.v_page_count, the actual count of available physical pages,
instead of vm_page_array_size to compute the maximum number of pv
entries.

Introduce the vm.pmap.pv_entries tunable on alpha and ia64.

Eliminate some unnecessary white space.

Discussed with: tegge (item #1)
Tested by: marcel (ia64)
2005-11-04 18:03:24 +00:00
Paul Saab 1471f287e1 Calling setrlimit from 32bit apps could potentially increase certain
limits beyond what should be capiable in a 32bit process, so we
must fixup the limits.

Reviewed by:	jhb
2005-11-02 21:18:07 +00:00
John Baldwin c7362ff7fb Change the x86 code to allocate IDT vectors on-demand when an interrupt
source is first enabled similar to how intr_event's now allocate ithreads
on-demand.  Previously, we would map IDT vectors 1:1 to IRQs.  Since we
only have 191 available IDT vectors for I/O interrupts, this limited us
to only supporting IRQs 0-190 corresponding to the first 190 I/O APIC
intpins.  On many machines, however, each PCI-X bus has its own APIC even
though it only has 1 or 2 devices, thus, we were reserving between 24 and
32 IRQs just for 1 or 2 devices and thus 24 or 32 IDT vectors.  With this
change, a machine with 100 IRQs but only 5 in use will only use up 5 IDT
vectors.  Also, this change provides an API (apic_alloc_vector() and
apic_free_vector()) that will allow a future MSI interrupt source driver to
request IDT vectors for use by MSI interrupts on x86 machines.

Tested on:	amd64, i386
2005-11-02 20:11:47 +00:00
John Baldwin d394d454b0 Throw the switch and turn on STOP_NMI on in GENERIC for amd64 and i386.
Requested by:	kris
Ok'd by:	scottl
2005-11-01 22:59:03 +00:00
Jung-uk Kim e8d472a7af Catch up with ACPI-CA 20051021 import 2005-11-01 22:44:08 +00:00
Alan Cox f7118bdf3b Instead of a panic()ing in pmap_insert_entry() if get_pv_entry()
fails, reclaim a pv entry by destroying a mapping to an inactive
page.

Change the format strings in many of the assertions that were recently
converted from PMAP_DIAGNOSTIC printf()s so that they are compatible
with PAE.  Avoid unnecessary differences between the amd64 and i386
format strings.
2005-10-31 21:25:33 +00:00
John Baldwin 296c4b1ad5 Hook nve(4) up in i386 and amd64 NOTES.
MFC after:	1 week
2005-10-31 20:45:37 +00:00
Robert Watson 5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Alan Cox 6fb8d0e3a7 Replace diagnostic printf()s by assertions. Use consistent style for
similar assertions.
2005-10-30 20:47:42 +00:00
Peter Wemm 54903605b3 MFi386: bring over DEFAULTS (repocopy) and adapt. While there isn't a
4.x->6.x amd64 upgrade path, the config files are kept in approximate sync.
2005-10-27 18:54:43 +00:00
David E. O'Brien 537b6cf3ea Remove atpic as we've changed to using the lapic timer vs. using irq0 2005-10-27 18:40:56 +00:00
John Baldwin 85d72e4a2e Create a default kernel config for i386 and move 'device isa' and
'device npx' (both of which aren't really optional right now) and
'device io' and 'device mem' (to preserve POLA for 4.x users upgrading
to 6.0) from GENERIC into DEFAULTS.

Requested by:	scottl
Reviewed by:	scottl
2005-10-27 17:34:35 +00:00
Peter Wemm a9fbe5d07b MFi386: Various apic fixes and tweaks
* Don't recursively panic if we've already paniced and the local apic is
  now stuck.
* Add hw.apic.* tunables/sysctls for extint controls
* Change "lapic%d timer" to "cpu%d timer" intname to match i386
2005-10-26 22:32:30 +00:00
Peter Wemm 8f155a8f2e Change PHYSMAP_SIZE to allow for more memory segments. The old value was
too low for certain Dell amd64 machines.
2005-10-26 22:16:52 +00:00
John Baldwin e0f66ef861 Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces.  struct intr_event holds the list
  of interrupt handlers associated with interrupt sources.
  struct intr_thread contains the data relative to an interrupt thread.
  Currently we still provide a 1:1 relationship of events to threads
  with the exception that events only have an associated thread if there
  is at least one threaded interrupt handler attached to the event.  This
  means that on x86 we no longer have 4 bazillion interrupt threads with
  no handlers.  It also means that interrupt events with only INTR_FAST
  handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
  intr_foo naming convention.  This did require renaming the powerpc
  MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
  powerpc.  This means that multiple INTR_FAST handlers can attach to the
  same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
  to the same interrupt.  Sharing INTR_FAST handlers may not always be
  desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
  either.  Drivers can always still use INTR_EXCL to ask for an interrupt
  exclusively.  The way this sharing works is that when an interrupt
  comes in, all the INTR_FAST handlers are executed first, and if any
  threaded handlers exist, the interrupt thread is scheduled afterwards.
  This type of layout also makes it possible to investigate using interrupt
  filters ala OS X where the filter determines whether or not its companion
  threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
  is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
  dumping their state.  It also has a '/v' verbose switch which dumps
  info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
  handler is removed because it would suck for things like ppbus(8)'s
  braindead behavior.  The code is present, though, it is just under
  #if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
  event into a separate function so that ithread_loop() becomes more
  readable.  Previously this code was all in the middle of ithread_loop()
  and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
  with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
  curthread's proc against idlethread's proc. (Not really related to intr
  changes)

Tested on:	alpha, amd64, i386, sparc64
Tested on:	arm, ia64 (older version of patch by cognet and marcel)
2005-10-25 19:48:48 +00:00
Bill Paul ba3af76df7 Modify the pci_cfgdisable() routine to bring it more in line with
other OSes (Solaris, Linux, VxWorks). It's not necessary to write a 0
to the config address register when using config mechanism 1 to turn
off config access. In fact, it can be downright troublesome, since it
seems to confuse the PCI-PCI bridge in the AMD8111 chipset and cause
it to sporadically botch reads from some devices. This is the cause
of the missing USP ports problem I was experiencing with my Sun Opteron
system.

Also correct the case for mechanism 2: it's only necessary to write
a 0 to the ENABLE port.
2005-10-25 04:53:29 +00:00
John Baldwin 58553b9925 Rename the KDB_STOP_NMI kernel option to STOP_NMI and make it apply to all
IPI_STOP IPIs.
- Change the i386 and amd64 MD IPI code to send an NMI if STOP_NMI is
  enabled if an attempt is made to send an IPI_STOP IPI.  If the kernel
  option is enabled, there is also a sysctl to change the behavior at
  runtime (debug.stop_cpus_with_nmi which defaults to enabled).  This
  includes removing stop_cpus_nmi() and making ipi_nmi_selected() a
  private function for i386 and amd64.
- Fix ipi_all(), ipi_all_but_self(), and ipi_self() on i386 and amd64 to
  properly handle bitmapped IPIs as well as IPI_STOP IPIs when STOP_NMI is
  enabled.
- Fix ipi_nmi_handler() to execute the restart function on the first CPU
  that is restarted making use of atomic_readandclear() rather than
  assuming that the BSP is always included in the set of restarted CPUs.
  Also, the NMI handler didn't clear the function pointer meaning that
  subsequent stop and restarts could execute the function again.
- Define a new macro HAVE_STOPPEDPCBS on i386 and amd64 to control the use
  of stoppedpcbs[] and always enable it for i386 and amd64 instead of
  being dependent on KDB_STOP_NMI.  It works fine in both the NMI and
  non-NMI cases.
2005-10-24 21:04:19 +00:00
John Baldwin 301268b8ca When restarting the BSP during cpu_reset() use a membar to ensure that
the updated cpustop_restartfunc is seen when the BSP resumes execution.
This matches the membar already present in restart_cpus().
2005-10-24 20:53:52 +00:00
John Baldwin 95d84e5461 Use xchg in Xcpustop to close a race and make cpustop_restartfunc truly
one-shot in the SMP case (before using the simple mov / cmp / mov sequence
could allow multiple CPUs to execute the restart function on resume).
2005-10-24 20:52:26 +00:00
John Baldwin 6b1e0d75b0 - Various small whitespace and style nits.
- Use PCPU_GET(cpumask) in preference to 1 << PCPU_GET(cpuid) in a few
  places.
2005-10-24 20:31:04 +00:00
Paul Saab bbf719c8ba include opt_compat.h to unbreak the build 2005-10-24 00:00:00 +00:00
Ade Lovett 8d228514fb Specifically panic() in the case where pmap_insert_entry() fails to
get a new pv under high system load where the available pv entries
have been exhausted before the pagedaemon has a chance to wake up
to reclaim some.

Prior to this, the NULL pointer dereference ended up causing
secondary panics with rather less than useful resulting tracebacks.

Reviewed by:	alc, jhb
MFC after:	1 week
2005-10-21 19:42:43 +00:00
Jung-uk Kim 7c799f4520 Redo physical/logical CPU count.
Suggested by:	jhb
2005-10-17 23:23:20 +00:00
David Xu 9313eb5537 Micro optimization for context switch. Eliminate code for saving gs.base
and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
when user wants to set them, in context switch routine, we only need to
write them into registers, we never have to read them out from registers
when thread is switched away. Since rdmsr is a serialization instruction,
micro benchmark shows it is worthy to do.

Reviewed by: peter, jhb
2005-10-17 23:10:31 +00:00
John Baldwin 05490142bb Another bit of sx(4) removal. 2005-10-17 18:35:57 +00:00
Jung-uk Kim 42fb42a399 Split displaying number of physical and logical cores. 2005-10-17 15:51:28 +00:00
David E. O'Brien f5dce7aa6e For AMD processors, nullify CPUID.HTT. FreeBSD has no need for the
information it conveys, and it is only confusing people.
This fixes incorrect output in the previous commit.
2005-10-16 08:58:27 +00:00
Jung-uk Kim 25736eb670 Correct few MSR addresses.
PR:		amd64/85852
Submitted by:	Nate Eldredge <nge at cs dot hmc dot edu>
2005-10-15 00:44:56 +00:00
Jung-uk Kim 9c3acb0bc1 - Print number of physical/logical cores and more CPUID info.
- Add newer CPUID definitions for future use.

Many thanks to Mike Tancsa <mike at sentex dot net> for providing test
cases for Intel Pentium D and AMD Athlon 64 X2.

Approved by:	anholt (mentor)
2005-10-14 22:52:01 +00:00
John Baldwin 728ef95410 The signal code is now an int rather than a long, so update debug printfs. 2005-10-14 20:22:57 +00:00
Ruslan Ermilov 6f6b430e2f Sort ath_rate_* entries. Mark ath_rate_sample as the desired algorithm.
Discussed with:	sam
2005-10-14 17:22:28 +00:00
David Xu 9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
Gleb Smirnoff 9c26aa3c12 Polling is now configured with help of ifconfig(8), not sysctl.
Prodded by:     maxim
2005-10-07 09:23:51 +00:00
Peter Wemm 07f5921b86 Don't set segment registers via ptrace yet. Its not ready. 2005-10-04 23:26:00 +00:00
Gleb Smirnoff 4092996774 Big polling(4) cleanup.
o Axe poll in trap.

o Axe IFF_POLLING flag from if_flags.

o Rework revision 1.21 (Giant removal), in such a way that
  poll_mtx is not dropped during call to polling handler.
  This fixes problem with idle polling.

o Make registration and deregistration from polling in a
  functional way, insted of next tick/interrupt.

o Obsolete kern.polling.enable. Polling is turned on/off
  with ifconfig.

Detailed kern_poll.c changes:
  - Remove polling handler flags, introduced in 1.21. The are not
    needed now.
  - Forget and do not check if_flags, if_capenable and if_drv_flags.
  - Call all registered polling handlers unconditionally.
  - Do not drop poll_mtx, when entering polling handlers.
  - In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
  - In netisr_poll() axe the block, where polling code asks drivers
    to unregister.
  - In netisr_poll() and ether_poll() do polling always, if any
    handlers are present.
  - In ether_poll_[de]register() remove a lot of error hiding code. Assert
    that arguments are correct, instead.
  - In ether_poll_[de]register() use standard return values in case of
    error or success.
  - Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
    poll_switch() goes through interface list and enabled/disables polling.
    A message that kern.polling.enable is deprecated is printed.

Detailed driver changes:
  - On attach driver announces IFCAP_POLLING in if_capabilities, but
    not in if_capenable.
  - On detach driver calls ether_poll_deregister() if polling is enabled.
  - In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
    flag. If there is no, then unlocks and returns.
  - In ioctl handler driver checks for IFCAP_POLLING flag requested to
    be set or cleared. Driver first calls ether_poll_[de]register(), then
    obtains driver lock and [dis/en]ables interrupts.
  - In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
    If present, then returns.This is important to protect from spurious
    interrupts.

Reviewed by:	ru, sam, jhb
2005-10-01 18:56:19 +00:00
Robert Watson 5f419982c2 Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by:	bde
2005-09-28 07:03:03 +00:00
Peter Wemm d176c062c9 I believe the stack underflows during early development that caused me to
add spare padding at the beginning of the pcb are long gone.  Remove the
padding fields.
2005-09-27 21:11:35 +00:00
Peter Wemm 1acc225f91 Kill pcb_rflags. It served no purpose.
Reported by:  bde
2005-09-27 21:10:10 +00:00
Peter Wemm 98df9e0010 Fix a minor nit that has been bugging me for a while. Fix the obvious
cases of using a 64 bit operation to zero a register.  32 bit opcodes are
smaller and supposedly faster, and clear the upper 32 bits for free.
2005-09-27 18:32:46 +00:00
Peter Wemm 901b68c185 Add a bare minimum (but wrong) R_X86_64_JMP_SLOT relocation type for
kernel modules.  We actually need to include any addends and the symbol
offset value, but for gcc/binutils didn't set it anywhere I've found on
'cc -fpic -shared' kernel modules.
2005-09-27 18:18:23 +00:00
Peter Wemm 04bc142a09 Don't report Maxmem as 'real memory'. It is really the highest address
available and can give the wrong impression when there are memory holes.
Report the total amount of usable memory that we detected instead of the
highest address.
2005-09-27 18:15:57 +00:00
Peter Wemm 686447adcc MFi386: If we take a trap with interrupts disabled while in a critical
section, don't enable them if we're servicing an NMI.
2005-09-27 18:13:07 +00:00
Peter Wemm 458d22f302 Don't let the upper bits of %dr6/%dr7 get set.
Submitted by:  Nate Eldredge <neldredge@math.ucsd.edu>
2005-09-27 18:10:26 +00:00
Peter Wemm add121a476 Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added
stubs for ia64 to keep it compiling.  These are used by 32 bit apps such
as gdb.
2005-09-27 18:04:20 +00:00
John Baldwin 3c2bc2bf26 Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on:	i386, alpha, sparc64, arm (cognet)
Reviewed by:	arch@
Submitted by:	cognet (arm)
MFC after:	1 week
2005-09-27 17:39:11 +00:00
Poul-Henning Kamp 57489b8657 __RMAN_RESOURCE_VISIBLE is not actually needed. 2005-09-25 20:03:41 +00:00
Stephan Uphoff 2a988f7cb5 Fix the "fpudna: fpcurthread == curthread XXX times" problem.
Tested by: kris@
Reviewed by:	peter@
MFC after:	3 days
2005-09-22 15:46:21 +00:00
Robert Watson 84d2b7df26 Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions.  In most cases, this means adding an
acquisition of Giant immediately around the function.  In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset.  This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after:	1 week
2005-09-19 16:51:43 +00:00
Christian S.J. Peron 33cdc78d01 Introduce a kernel config for the Mandatory Access Control framework.
This kernel config briefly describes some of the major MAC policies
available on FreeBSD. The hope is that this will raise the awareness
about MAC and get more people interested.

Discussed with:	scottl
2005-09-18 03:15:36 +00:00
Warner Losh 62061bf002 MFi386: pci attribute allocation fixes. 2005-09-18 01:42:43 +00:00
John Baldwin 80d52f16da Stop using the '+' constraint modifier with inline assembly. The '+'
constraint is actually only allowed for register operands.  Instead, use
separate input and output memory constraints.

Education from:	alc
Reviewed by:	alc
Tested on:	i386, alpha
MFC after:	1 week
2005-09-15 19:31:22 +00:00
David E. O'Brien 2a191126de Canonize the include of acpi.h. 2005-09-11 18:39:03 +00:00
Marcel Moolenaar 216e80c2ba Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint()
and db_md_list_watchpoints() to ddb/ddb.h.
2005-09-10 03:01:25 +00:00
Scott Long dc8540a9a0 Hook up the hptmv driver for amd64.
MFC After: 3 days
2005-09-08 03:29:18 +00:00
Alan Cox 3be99ffc1a Eliminate unnecessary TLB invalidations by pmap_enter(). Specifically,
eliminate TLB invalidations when permissions are relaxed, such as when a
read-only mapping is changed to a read/write mapping.  Additionally,
eliminate TLB invalidations when bits that are ignored by the hardware,
such as PG_W ("wired mapping"), are changed.

Reviewed by:	tegge
2005-09-04 19:06:27 +00:00
Alan Cox ba8bca610c Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine
whether the mapping should permit execute access.
2005-09-03 18:20:20 +00:00
Joseph Koshy 7ef5ed2bb1 - Special-case NMI handling on the AMD64.
On entry or exit from the kernel the 'alltraps' and 'doreti' code
  used taken by normal traps disables interrupts to protect the
  critical sections where it is setting up %gs.

  This protection is insufficient in the presence of NMIs since NMIs
  can be taken even when the processor has disabled normal interrupts.
  Thus the NMI handler needs to actually read MSR_GBASE on entry to
  the kernel to determine whether a swap of %gs using 'swapgs' is
  needed.  However, reads of MSRs are expensive and integrating this
  check into the 'alltraps'/'doreti' path would penalize normal
  interrupts.

- Teach DDB about the 'nmi_calltrap' symbol.

Reviewed by:	bde, peter (older versions of this change)
2005-08-27 16:03:40 +00:00
Alan Cox 8c190069a2 Remedy the following three problems:
1. The amd64 pmap, unlike the i386 pmap, maintains a reference count
   for each page directory (PD) page.  However, in the transformation
   of the i386 pmap into the amd64 pmap, operations, such as
   pmap_copy() and pmap_object_init_pt(), that create 2MB "superpage"
   mappings by setting the PG_PS bit in a PD entry were not modified
   to adjust the underlying PD page's reference count.  Consequently,
   superpage mappings could disappear prematurely.

2. pmap_object_init_pt() could crash or corrupt memory if either the
   virtual address range being mapped crosses a 1GB boundary in the
   virtual address space or nothing is mapped in the 1GB area.

3. When pmap_allocpte() destroys a 2MB "superpage" mapping it does not
   reduce the pmap's resident count accordingly.  It should.  (This
   bug is inherited from i386.)

Discussed with: peter
Reviewed by:    tegge
2005-08-26 05:18:46 +00:00
Stephan Uphoff 0e7bd54c71 NMI handler should not enable interrupts.
Tested by: kris@
MFC after:	3 weeks
2005-08-25 20:33:43 +00:00
Alan Cox 7380c1acb5 Pass the PDE from pmap_remove() to pmap_remove_page() so that the latter
procedure doesn't have to recompute it.
2005-08-22 20:02:40 +00:00
Alan Cox 2c15852e1e Change pmap_extract() and pmap_extract_and_hold() to use PG_FRAME rather
than ~PDRMASK to extract the physical address of a superpage from a PDE.
The use of ~PDRMASK is problematic if the PDE has PG_NX set.  Specifically,
the PG_NX bit will be included in the physical address if ~PDRMASK is used.

Reviewed by:	peter
2005-08-22 07:23:51 +00:00
Alan Cox 8b78b3dc58 Introduce pmap_pml4e_to_pdpe() and pmap_pdpe_to_pde() and use them to avoid
recomputation of the pml4e and pdpe in pmap_copy(), pmap_protect(), and
pmap_remove().
2005-08-20 18:37:34 +00:00
Stefan Farfeleder a1f85d7f83 Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ.  Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with:		bde
2005-08-20 16:44:41 +00:00
Pawel Jakub Dawidek a95452ee8d Avoid code duplication and implement bitcount32() function in systm.h only.
Reviewed by:	cperciva
MFC after:	3 days
2005-08-19 22:10:19 +00:00
Alan Cox eb3d4a39fd Correct a performance bug in revision 1.462. The effect of the bug is to
execute the outer loop in procedures such as pmap_protect() many more times
than necessary.

Reviewed by:	tegge
2005-08-19 07:25:40 +00:00
John Baldwin 5d2f4de5da Add aliases for atomic operations on 64-bit integers just like other
64-bit platforms.

MFC after:	1 week
2005-08-18 14:36:47 +00:00
Alan Cox 96e5109430 Simplify the page table page reference counting by pmap_enter()'s change of
mapping case.

Eliminate a stale comment from pmap_enter().

Reviewed by:	tegge
2005-08-14 20:02:50 +00:00
Alan Cox 50b334506f Eliminate unneeded diagnostic code.
Eliminate an unused #include.  (Kernel stack allocation and deallocation
long ago migrated to the machine-independent code.)
2005-08-11 23:38:02 +00:00
Alan Cox b69dd0fda6 Eliminate unneeded diagnostic code.
Reviewed by:	tegge
2005-08-11 17:43:28 +00:00
Alan Cox 8e7a85fac9 Decouple the unrefing of a page table page from the removal of a pv entry.
In other words, change pmap_remove_entry() such that it no longer unrefs
the page table page.  Now, it only removes the pv entry.

Reviewed by:	tegge
2005-08-11 02:22:55 +00:00
Alan Cox 5f2c46d5ed When support for 2MB/4MB pages was added in revision 1.148 an error was
made in pmap_protect(): The pmap's resident count should not be reduced
unless mappings are removed.

The errant change to the pmap's resident count could result in a later
pmap_remove() failing to remove any mappings if the errant change has set
the pmap's resident count to zero.
2005-08-07 22:00:47 +00:00
Jeff Roberson d0f5f62d8b - Add support for saving stack traces and displaying them via printf(9)
and KTR.

Contributed by:		Antoine Brodin <antoine.brodin@laposte.net>
Concept code from:	Neal Fachan <neal@isilon.com>
2005-08-03 04:33:48 +00:00
Jeff Roberson bb2fd058ce - Improve the definition of INKERNEL() to include the DMAP area and the
proper start of the kernel area.

Discussed with:	peter
2005-08-03 04:21:51 +00:00
John Baldwin 813a5e14ec Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c
so that they aren't duplicated 3 times and are also in the same file as
the code that depends on the SYSVIPC modules.
2005-07-29 19:40:39 +00:00
Maxime Henrion 5a29706504 Add back ed(4) in amd64 GENERIC. It now works nicely and since those
chips are commonly found, it makes sense to have it in GENERIC.  This
is a candidate for a RELENG_6 MFC.

Approved by;	peter
Requested by:	pav
Tested by:	pav
2005-07-24 17:55:57 +00:00
Ruslan Ermilov c5bf476aa9 Fallout from the previous revision: lnc isn't quite ready for amd64 yet. 2005-07-22 16:02:40 +00:00
David E. O'Brien b8b77732ff Fix $FreeBSD$. 2005-07-22 04:03:25 +00:00
Peter Wemm 9e76f9ad3f Like on i386, bypass lock prefix for atomic ops on !SMP kernels. 2005-07-21 22:35:02 +00:00
Peter Wemm 4bf21bfef9 MFi386: add vpd driver (vital product data.. model & serial numbers etc) 2005-07-21 21:57:31 +00:00
Peter Wemm 6ec2971397 Add the ed driver for lint building. The PCI instances are still useful.
In theory, there are no isa slots on any amd64/em64t systems, but it
doesn't hurt to keep these tiny fragments compiling.
2005-07-21 21:55:11 +00:00
Peter Wemm 84d7e08229 Actually create the double fault stack page for AP cpus so that we have a
chance of getting a working double fault instead of converting it to an
instant triple fault reset.
2005-07-21 21:46:09 +00:00
Poul-Henning Kamp 636d90fc5c Make the facility for recognizing BIOS-signatures more general
and return a printable representation.

This fixes recognition of the PC Engines WRAP and improves the
recognition of the Soekris boards (Bios version can now be
seen in the dmesg output for instance).

Also, add watchdog support for PCM-582x platforms.

Submitted by:	Adrian Steinmann <ast@marabu.ch>
Slightly changed by:	phk
PR:	81360
2005-07-21 09:48:37 +00:00