Forum | Documentation | Website | Blog

Skip to content
Snippets Groups Projects
  1. Aug 07, 2014
  2. Jul 31, 2014
    • Greg Kroah-Hartman's avatar
      Linux 3.4.101 · 91f7c8cb
      Greg Kroah-Hartman authored
      v3.4.101
      91f7c8cb
    • Catalin Marinas's avatar
      mm: kmemleak: avoid false negatives on vmalloc'ed objects · 7e2e6118
      Catalin Marinas authored
      commit 7f88f88f upstream.
      
      Commit 248ac0e1
      
       ("mm/vmalloc: remove guard page from between vmap
      blocks") had the side effect of making vmap_area.va_end member point to
      the next vmap_area.va_start.  This was creating an artificial reference
      to vmalloc'ed objects and kmemleak was rarely reporting vmalloc() leaks.
      
      This patch marks the vmap_area containing pointers explicitly and
      reduces the min ref_count to 2 as vm_struct still contains a reference
      to the vmalloc'ed object.  The kmemleak add_scan_area() function has
      been improved to allow a SIZE_MAX argument covering the rest of the
      object (for simpler calling sites).
      
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [hq: Backported to 3.4: Adjust context]
      Signed-off-by: default avatarQiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      7e2e6118
    • Xi Wang's avatar
      introduce SIZE_MAX · b06b5c62
      Xi Wang authored
      commit a3860c1c
      
       upstream.
      
      ULONG_MAX is often used to check for integer overflow when calculating
      allocation size.  While ULONG_MAX happens to work on most systems, there
      is no guarantee that `size_t' must be the same size as `long'.
      
      This patch introduces SIZE_MAX, the maximum value of `size_t', to improve
      portability and readability for allocation size validation.
      
      Signed-off-by: default avatarXi Wang <xi.wang@gmail.com>
      Acked-by: default avatarAlex Elder <elder@dreamhost.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b06b5c62
    • Martin Schwidefsky's avatar
      s390/ptrace: fix PSW mask check · 883ea134
      Martin Schwidefsky authored
      commit dab6cf55
      
       upstream.
      
      The PSW mask check of the PTRACE_POKEUSR_AREA command is incorrect.
      The PSW_MASK_USER define contains the PSW_MASK_ASC bits, the ptrace
      interface accepts all combinations for the address-space-control
      bits. To protect the kernel space the PSW mask check in ptrace needs
      to reject the address-space-control bit combination for home space.
      
      Fixes CVE-2014-3534
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      883ea134
    • Linus Torvalds's avatar
      Fix gcc-4.9.0 miscompilation of load_balance() in scheduler · e5c662d1
      Linus Torvalds authored
      commit 2062afb4 upstream.
      
      Michel Dänzer and a couple of other people reported inexplicable random
      oopses in the scheduler, and the cause turns out to be gcc mis-compiling
      the load_balance() function when debugging is enabled.  The gcc bug
      apparently goes back to gcc-4.5, but slight optimization changes means
      that it now showed up as a problem in 4.9.0 and 4.9.1.
      
      The instruction scheduling problem causes gcc to schedule a spill
      operation to before the stack frame has been created, which in turn can
      corrupt the spilled value if an interrupt comes in.  There may be other
      effects of this bug too, but that's the code generation problem seen in
      Michel's case.
      
      This is fixed in current gcc HEAD, but the workaround as suggested by
      Markus Trippelsdorf is pretty simple: use -fno-var-tracking-assignments
      when compiling the kernel, which disables the gcc code that causes the
      problem.  This can result in slightly worse debug information for
      variable accesses, but that is infinitely preferable to actual code
      generation problems.
      
      Doing this unconditionally (not just for CONFIG_DEBUG_INFO) also allows
      non-debug builds to verify that the debug build would be identical: we
      can do
      
          export GCC_COMPARE_DEBUG=1
      
      to make gcc internally verify that the result of the build is
      independent of the "-g" flag (it will make the compiler build everything
      twice, toggling the debug flag, and compare the results).
      
      Without the "-fno-var-tracking-assignments" option, the build would fail
      (even with 4.8.3 that didn't show the actual stack frame bug) with a gcc
      compare failure.
      
      See also gcc bugzilla:
      
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801
      
      
      
      Reported-by: default avatarMichel Dänzer <michel@daenzer.net>
      Suggested-by: default avatarMarkus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5c662d1
    • Naoya Horiguchi's avatar
      mm: hugetlb: fix copy_hugetlb_page_range() · 97f5dabb
      Naoya Horiguchi authored
      commit 0253d634 upstream.
      
      Commit 4a705fef ("hugetlb: fix copy_hugetlb_page_range() to handle
      migration/hwpoisoned entry") changed the order of
      huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
      in some workloads like hugepage-backed heap allocation via libhugetlbfs.
      This patch fixes it.
      
      The test program for the problem is shown below:
      
        $ cat heap.c
        #include <unistd.h>
        #include <stdlib.h>
        #include <string.h>
      
        #define HPS 0x200000
      
        int main() {
        	int i;
        	char *p = malloc(HPS);
        	memset(p, '1', HPS);
        	for (i = 0; i < 5; i++) {
        		if (!fork()) {
        			memset(p, '2', HPS);
        			p = malloc(HPS);
        			memset(p, '3', HPS);
        			free(p);
        			return 0;
        		}
        	}
        	sleep(1);
        	free(p);
        	return 0;
        }
      
        $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap
      
      Fixes 4a705fef
      
       ("hugetlb: fix copy_hugetlb_page_range() to handle
      migration/hwpoisoned entry"), so is applicable to -stable kernels which
      include it.
      
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reported-by: default avatarGuillaume Morin <guillaume@morinfr.org>
      Suggested-by: default avatarGuillaume Morin <guillaume@morinfr.org>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97f5dabb
    • Sven Wegener's avatar
      x86_32, entry: Store badsys error code in %eax · 4aedd4b0
      Sven Wegener authored
      commit 8142b215 upstream.
      
      Commit 554086d8
      
       ("x86_32, entry: Do syscall exit work on badsys
      (CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
      code, resulting in syscall() not returning proper errors for undefined
      syscalls on CPUs supporting the sysenter feature.
      
      The following code:
      
      > int result = syscall(666);
      > printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));
      
      results in:
      
      > result=666 errno=0 error=Success
      
      Obviously, the syscall return value is the called syscall number, but it
      should have been an ENOSYS error. When run under ptrace it behaves
      correctly, which makes it hard to debug in the wild:
      
      > result=-1 errno=38 error=Function not implemented
      
      The %eax register is the return value register. For debugging via ptrace
      the syscall entry code stores the complete register context on the
      stack. The badsys handlers only store the ENOSYS error code in the
      ptrace register set and do not set %eax like a regular syscall handler
      would. The old resume_userspace call chain contains code that clobbers
      %eax and it restores %eax from the ptrace registers afterwards. The same
      goes for the ptrace-enabled call chain. When ptrace is not used, the
      syscall return value is the passed-in syscall number from the untouched
      %eax register.
      
      Use %eax as the return value register in syscall_badsys and
      sysenter_badsys, like a real syscall handler does, and have the caller
      push the value onto the stack for ptrace access.
      
      Signed-off-by: default avatarSven Wegener <sven.wegener@stealer.net>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.net
      
      
      Reviewed-and-tested-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4aedd4b0
    • Romain Degez's avatar
      ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode) · 46005527
      Romain Degez authored
      commit b32bfc06
      
       upstream.
      
      Add support of the Promise FastTrak TX8660 SATA HBA in ahci mode by
      registering the board in the ahci_pci_tbl[].
      
      Note: this HBA also provide a hardware RAID mode when activated in
      BIOS but specific drivers from the manufacturer are required in this
      case.
      
      Signed-off-by: default avatarRomain Degez <romain.degez@gmail.com>
      Tested-by: default avatarRomain Degez <romain.degez@gmail.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      46005527
    • Tejun Heo's avatar
      libata: introduce ata_host->n_tags to avoid oops on SAS controllers · 80cd492c
      Tejun Heo authored
      commit 1a112d10 upstream.
      
      1871ee13
      
       ("libata: support the ata host which implements a queue
      depth less than 32") directly used ata_port->scsi_host->can_queue from
      ata_qc_new() to determine the number of tags supported by the host;
      unfortunately, SAS controllers doing SATA don't initialize ->scsi_host
      leading to the following oops.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
       IP: [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
       PGD 0
       Oops: 0002 [#1] SMP
       Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm
       CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ #62
       Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
       task: ffff880c1a00b280 ti: ffff88061a000000 task.ti: ffff88061a000000
       RIP: 0010:[<ffffffff814e0618>]  [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
       RSP: 0018:ffff88061a003ae8  EFLAGS: 00010012
       RAX: 0000000000000001 RBX: ffff88000241ca80 RCX: 00000000000000fa
       RDX: 0000000000000020 RSI: 0000000000000020 RDI: ffff8806194aa298
       RBP: ffff88061a003ae8 R08: ffff8806194a8000 R09: 0000000000000000
       R10: 0000000000000000 R11: ffff88000241ca80 R12: ffff88061ad58200
       R13: ffff8806194aa298 R14: ffffffff814e67a0 R15: ffff8806194a8000
       FS:  00007f3ad7fe3840(0000) GS:ffff880627620000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000058 CR3: 000000061a118000 CR4: 00000000001407e0
       Stack:
        ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200
        ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68
        ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80
       Call Trace:
        [<ffffffff814e96e1>] ata_sas_queuecmd+0xa1/0x430
        [<ffffffffa0056ce1>] sas_queuecommand+0x191/0x220 [libsas]
        [<ffffffff8149afee>] scsi_dispatch_cmd+0x10e/0x300 [<ffffffff814a3bc5>] scsi_request_fn+0x2f5/0x550
        [<ffffffff81317613>] __blk_run_queue+0x33/0x40
        [<ffffffff8131781a>] queue_unplugged+0x2a/0x90
        [<ffffffff8131ceb4>] blk_flush_plug_list+0x1b4/0x210
        [<ffffffff8131d274>] blk_finish_plug+0x14/0x50
        [<ffffffff8117eaa8>] __do_page_cache_readahead+0x198/0x1f0
        [<ffffffff8117ee21>] force_page_cache_readahead+0x31/0x50
        [<ffffffff8117ee7e>] page_cache_sync_readahead+0x3e/0x50
        [<ffffffff81172ac6>] generic_file_read_iter+0x496/0x5a0
        [<ffffffff81219897>] blkdev_read_iter+0x37/0x40
        [<ffffffff811e307e>] new_sync_read+0x7e/0xb0
        [<ffffffff811e3734>] vfs_read+0x94/0x170
        [<ffffffff811e43c6>] SyS_read+0x46/0xb0
        [<ffffffff811e33d1>] ? SyS_lseek+0x91/0xb0
        [<ffffffff8171ee29>] system_call_fastpath+0x16/0x1b
       Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 <89> 14 25 58 00 00 00
      
      Fix it by introducing ata_host->n_tags which is initialized to
      ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to
      scsi_host_template->can_queue in ata_host_register() for !SAS ones.
      As SAS hosts are never registered, this will give them the same
      ATA_MAX_QUEUE - 1 as before.  Note that we can't use
      scsi_host->can_queue directly for SAS hosts anyway as they can go
      higher than the libata maximum.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarMike Qiu <qiudayu@linux.vnet.ibm.com>
      Reported-by: default avatarJesse Brandeburg <jesse.brandeburg@gmail.com>
      Reported-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Tested-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Fixes: 1871ee13
      
       ("libata: support the ata host which implements a queue depth less than 32")
      Cc: Kevin Hao <haokexin@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      80cd492c
    • Kevin Hao's avatar
      libata: support the ata host which implements a queue depth less than 32 · 6f0844c4
      Kevin Hao authored
      commit 1871ee13 upstream.
      
      The sata on fsl mpc8315e is broken after the commit 8a4aeec8
      ("libata/ahci: accommodate tag ordered controllers"). The reason is
      that the ata controller on this SoC only implement a queue depth of
      16. When issuing the commands in tag order, all the commands in tag
      16 ~ 31 are mapped to tag 0 unconditionally and then causes the sata
      malfunction. It makes no senses to use a 32 queue in software while
      the hardware has less queue depth. So consider the queue depth
      implemented by the hardware when requesting a command tag.
      
      Fixes: 8a4aeec8
      
       ("libata/ahci: accommodate tag ordered controllers")
      Signed-off-by: default avatarKevin Hao <haokexin@gmail.com>
      Acked-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f0844c4
    • Christoph Hellwig's avatar
      block: don't assume last put of shared tags is for the host · 696acd9d
      Christoph Hellwig authored
      commit d45b3279
      
       upstream.
      
      There is no inherent reason why the last put of a tag structure must be
      the one for the Scsi_Host, as device model objects can be held for
      arbitrary periods.  Merge blk_free_tags and __blk_free_tags into a single
      funtion that just release a references and get rid of the BUG() when the
      host reference wasn't the last.
      
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      696acd9d
  3. Jul 28, 2014
    • Greg Kroah-Hartman's avatar
      Linux 3.4.100 · 82f9c4a3
      Greg Kroah-Hartman authored
      v3.4.100
      82f9c4a3
    • Takao Indoh's avatar
      iommu/vt-d: Disable translation if already enabled · 21870a3c
      Takao Indoh authored
      commit 3a93c841
      
       upstream.
      
      This patch disables translation(dma-remapping) before its initialization
      if it is already enabled.
      
      This is needed for kexec/kdump boot. If dma-remapping is enabled in the
      first kernel, it need to be disabled before initializing its page table
      during second kernel boot. Wei Hu also reported that this is needed
      when second kernel boots with intel_iommu=off.
      
      Basically iommu->gcmd is used to know whether translation is enabled or
      disabled, but it is always zero at boot time even when translation is
      enabled since iommu->gcmd is initialized without considering such a
      case. Therefor this patch synchronizes iommu->gcmd value with global
      command register when iommu structure is allocated.
      
      Signed-off-by: default avatarTakao Indoh <indou.takao@jp.fujitsu.com>
      Signed-off-by: default avatarJoerg Roedel <joro@8bytes.org>
      [wyj: Backported to 3.4: adjust context]
      Signed-off-by: default avatarYijing Wang <wangyijing@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21870a3c
    • Takashi Iwai's avatar
      PM / sleep: Fix request_firmware() error at resume · 5dc1c885
      Takashi Iwai authored
      commit 4320f6b1 upstream.
      
      The commit [247bc037: PM / Sleep: Mitigate race between the freezer
      and request_firmware()] introduced the finer state control, but it
      also leads to a new bug; for example, a bug report regarding the
      firmware loading of intel BT device at suspend/resume:
        https://bugzilla.novell.com/show_bug.cgi?id=873790
      
      The root cause seems to be a small window between the process resume
      and the clear of usermodehelper lock.  The request_firmware() function
      checks the UMH lock and gives up when it's in UMH_DISABLE state.  This
      is for avoiding the invalid  f/w loading during suspend/resume phase.
      The problem is, however, that usermodehelper_enable() is called at the
      end of thaw_processes().  Thus, a thawed process in between can kick
      off the f/w loader code path (in this case, via btusb_setup_intel())
      even before the call of usermodehelper_enable().  Then
      usermodehelper_read_trylock() returns an error and request_firmware()
      spews WARN_ON() in the end.
      
      This oneliner patch fixes the issue just by setting to UMH_FREEZING
      state again before restarting tasks, so that the call of
      request_firmware() will be blocked until the end of this function
      instead of returning an error.
      
      Fixes: 247bc037 (PM / Sleep: Mitigate race between the freezer and request_firmware())
      Link: https://bugzilla.novell.com/show_bug.cgi?id=873790
      
      
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5dc1c885
    • John Stultz's avatar
      alarmtimer: Fix bug where relative alarm timers were treated as absolute · 299e667e
      John Stultz authored
      commit 16927776
      
       upstream.
      
      Sharvil noticed with the posix timer_settime interface, using the
      CLOCK_REALTIME_ALARM or CLOCK_BOOTTIME_ALARM clockid, if the users
      tried to specify a relative time timer, it would incorrectly be
      treated as absolute regardless of the state of the flags argument.
      
      This patch corrects this, properly checking the absolute/relative flag,
      as well as adds further error checking that no invalid flag bits are set.
      
      Reported-by: default avatarSharvil Nanavati <sharvil@google.com>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sharvil Nanavati <sharvil@google.com>
      Link: http://lkml.kernel.org/r/1404767171-6902-1-git-send-email-john.stultz@linaro.org
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      299e667e
    • Alex Deucher's avatar
      drm/radeon: avoid leaking edid data · b63dd4c8
      Alex Deucher authored
      commit 0ac66eff
      
       upstream.
      
      In some cases we fetch the edid in the detect() callback
      in order to determine what sort of monitor is connected.
      If that happens, don't fetch the edid again in the get_modes()
      callback or we will leak the edid.
      
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b63dd4c8
    • Amitkumar Karwar's avatar
      mwifiex: fix Tx timeout issue · ed379762
      Amitkumar Karwar authored
      commit d76744a9 upstream.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=70191
      https://bugzilla.kernel.org/show_bug.cgi?id=77581
      
      
      
      It is observed that sometimes Tx packet is downloaded without
      adding driver's txpd header. This results in firmware parsing
      garbage data as packet length. Sometimes firmware is unable
      to read the packet if length comes out as invalid. This stops
      further traffic and timeout occurs.
      
      The root cause is uninitialized fields in tx_info(skb->cb) of
      packet used to get garbage values. In this case if
      MWIFIEX_BUF_FLAG_REQUEUED_PKT flag is mistakenly set, txpd
      header was skipped. This patch makes sure that tx_info is
      correctly initialized to fix the problem.
      
      Reported-by: default avatarAndrew Wiley <wiley.andrew.j@gmail.com>
      Reported-by: default avatarLinus Gasser <list@markas-al-nour.org>
      Reported-by: default avatarMichael Hirsch <hirsch@teufel.de>
      Tested-by: default avatarXinming Hu <huxm@marvell.com>
      Signed-off-by: default avatarAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: default avatarMaithili Hinge <maithili@marvell.com>
      Signed-off-by: default avatarAvinash Patil <patila@marvell.com>
      Signed-off-by: default avatarBing Zhao <bzhao@marvell.com>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed379762
    • HATAYAMA Daisuke's avatar
      perf/x86/intel: ignore CondChgd bit to avoid false NMI handling · a2b2c030
      HATAYAMA Daisuke authored
      commit b292d7a1
      
       upstream.
      
      Currently, any NMI is falsely handled by a NMI handler of NMI watchdog
      if CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR is set.
      
      For example, we use external NMI to make system panic to get crash
      dump, but in this case, the external NMI is falsely handled do to the
      issue.
      
      This commit deals with the issue simply by ignoring CondChgd bit.
      
      Here is explanation in detail.
      
      On x86 NMI watchdog uses performance monitoring feature to
      periodically signal NMI each time performance counter gets overflowed.
      
      intel_pmu_handle_irq() is called as a NMI_LOCAL handler from a NMI
      handler of NMI watchdog, perf_event_nmi_handler(). It identifies an
      owner of a given NMI by looking at overflow status bits in
      MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of the bits are set, then it
      handles the given NMI as its own NMI.
      
      The problem is that the intel_pmu_handle_irq() doesn't distinguish
      CondChgd bit from other bits. Unlike the other status bits, CondChgd
      bit doesn't represent overflow status for performance counters. Thus,
      CondChgd bit cannot be thought of as a mark indicating a given NMI is
      NMI watchdog's.
      
      As a result, if CondChgd bit is set, any NMI is falsely handled by the
      NMI handler of NMI watchdog. Also, if type of the falsely handled NMI
      is either NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
      action is never performed until CondChgd bit is cleared.
      
      I noticed this behavior on systems with Ivy Bridge processors: Intel
      Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
      CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR has already been set
      in the beginning at boot. Then the CondChgd bit is immediately cleared
      by next wrmsr to MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain
      0.
      
      On the other hand, on older processors such as Nehalem, Xeon E7540,
      CondChgd bit is not set in the beginning at boot.
      
      I'm not sure about exact behavior of CondChgd bit, in particular when
      this bit is set. Although I read Intel System Programmer's Manual to
      figure out that, the descriptions I found are:
      
        In 18.9.1:
      
        "The MSR_PERF_GLOBAL_STATUS MSR also provides a ¡sticky bit¢ to
         indicate changes to the state of performancmonitoring hardware"
      
        In Table 35-2 IA-32 Architectural MSRs
      
        63 CondChg: status bits of this register has changed.
      
      These are different from the bahviour I see on the actual system as I
      explained above.
      
      At least, I think ignoring CondChgd bit should be enough for NMI
      watchdog perspective.
      
      Signed-off-by: default avatarHATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Acked-by: default avatarDon Zickus <dzickus@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2b2c030
    • Eric Dumazet's avatar
      ipv4: fix buffer overflow in ip_options_compile() · 26adeae3
      Eric Dumazet authored
      [ Upstream commit 10ec9472 ]
      
      There is a benign buffer overflow in ip_options_compile spotted by
      AddressSanitizer[1] :
      
      Its benign because we always can access one extra byte in skb->head
      (because header is followed by struct skb_shared_info), and in this case
      this byte is not even used.
      
      [28504.910798] ==================================================================
      [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
      [28504.913170] Read of size 1 by thread T15843:
      [28504.914026]  [<ffffffff81802f91>] ip_options_compile+0x121/0x9c0
      [28504.915394]  [<ffffffff81804a0d>] ip_options_get_from_user+0xad/0x120
      [28504.916843]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.918175]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.919490]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.920835]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.922208]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.923459]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.924722]
      [28504.925106] Allocated by thread T15843:
      [28504.925815]  [<ffffffff81804995>] ip_options_get_from_user+0x35/0x120
      [28504.926884]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.927975]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.929175]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.930400]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.931677]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.932851]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.934018]
      [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
      [28504.934377]  of 40-byte region [ffff880026382800, ffff880026382828)
      [28504.937144]
      [28504.937474] Memory state around the buggy address:
      [28504.938430]  ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.939884]  ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.941294]  ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.942504]  ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.943483]  ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.945573]                         ^
      [28504.946277]  ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.094949]  ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.096114]  ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.097116]  ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.098472]  ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.099804] Legend:
      [28505.100269]  f - 8 freed bytes
      [28505.100884]  r - 8 redzone bytes
      [28505.101649]  . - 8 allocated bytes
      [28505.102406]  x=1..7 - x allocated bytes + (8-x) redzone bytes
      [28505.103637] ==================================================================
      
      [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
      
      
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      26adeae3
    • Ben Hutchings's avatar
      dns_resolver: Null-terminate the right string · 4427f9ab
      Ben Hutchings authored
      [ Upstream commit 640d7efe
      
       ]
      
      *_result[len] is parsed as *(_result[len]) which is not at all what we
      want to touch here.
      
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Fixes: 84a7c0b1
      
       ("dns_resolver: assure that dns_query() result is null-terminated")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4427f9ab